Preface


Financial technology (FinTech) is growing explosively and reshaping the ecology of
the financial industry, enabling China's financial industry to keep achieving
breakthroughs on a new track. At the same time, the rapid development of FinTech is
driving the in-depth integration of mathematics, finance, and advanced technology.

The Second International Academic Forum on Financial Mathematics and Financial
Technology was successfully held online on August 13-15, 2021. It was jointly
organized by the School of Mathematics of Renmin University of China, the Engineering
Research Center of Finance Computation and Digital Engineering of the Ministry of
Education, the Statistics and Big Data Research Institute of Renmin University of
China, the Blockchain Research Institute of Renmin University of China, the
Zhongguancun Internet Finance Research Institute, and the Renmin University Press.
Several distinguished scholars engaged in interdisciplinary research in mathematics,
statistics, information technology, and finance delivered excellent speeches and
discussed in depth the bottlenecks faced by emerging technologies such as big data,
AI, cloud computing, and blockchain. The forum provided insight into the development
frontier and research hotspots of financial mathematics and financial technology, and
strengthened the contact between our institute and research institutes at home and
abroad.

The proceedings highlight selected aspects of current and upcoming trends in FinTech,
presenting innovative mathematical models and state-of-the-art technologies. They
should benefit both scholars and practitioners who seek to integrate elegant
mathematical models with up-to-date data mining technologies in financial market
analysis.

Chapter “On the Development of Fintech in Asia” provides a general overview of the
development of fintech in Asia. Chapter “A Probability Inequality with Application to
Lattice Theory” gives a more precise estimate of the probability of decryption error
for the GGH public-key encryption scheme, based on the Hoeffding inequality. The upper
bound on this probability can be made close to 0 with suitable parameters, which means
that the probability of decryption error for the cryptosystem can be sufficiently
small. It is also confirmed that the GGH public-key cryptosystem can have high
security. Chapter “Robust Identification of Gene-Environment Interactions Under
High-Dimensional Accelerated Failure Time Models” considers censored survival data and
adopts a high-dimensional accelerated failure time (AFT) model for robust
identification of gene-environment interactions. Chapter “A Novel Approach for
Improving Accuracy for Distributed Storage Networks” proposes two approaches, periodic
self-verification and user verification, to guarantee the reliability of the storage
network while improving efficiency in distributed storage. Chapter “Iterative Learning
Control Based on Random Variance Reduction Gradient Method” proposes a novel iterative
learning control scheme based on the stochastic variance reduced gradient (SVRG)
method, which is not only suitable for resolving the incomplete information problem,
but also converges efficiently under both strongly convex and non-strongly convex
control objectives. Chapter “A Generalization of NTRUEncrypt” first discusses a more
general form of the ordinary cyclic code and gives a generalized construction of NTRU
based on ideal matrices and q-ary lattice theory. Compared with other variations of
NTRU, such as CTRU, GNTRU, QTRU, and BITRU, the extended NTRU cryptosystem is
constructed with general ideal matrices rather than special algebraic structures.
Chapter “Cyclic Lattices, Ideal Lattices, and Bounds for the Smoothing Parameter”
shows that ideal lattices are actually a special subclass of cyclic lattices, and
proves that there is a one-to-one correspondence between cyclic lattices and finitely
generated R-modules. Chapter “On the LWE Cryptosystem with More General Disturbance”
estimates the probability of decryption error under Gaussian disturbances and proves
that the decryption error can be sufficiently small. Its most salient contribution is
that, for any general disturbance, the decryption error can also be made small enough.
This indicates the high security and reliability of LWE-based cryptosystems. In other
words, such a cryptosystem is secure enough against passive eavesdroppers and can be
applied in many kinds of encryption processes. Chapter “On the High Dimensional RSA
Algorithm—A Public Key Cryptosystem Based on Lattice and Algebraic Number Theory”
proves that high-dimensional RSA is a lattice-based public-key cryptosystem, which can
be considered a new member of the family of post-quantum cryptography. Moreover, the
matrix expression of any algebraic number field is also given, which is a new result
even in the sense of classical algebraic number theory. Chapter “Central Bank Digital
Currency Cross-Border Payment Model Based on Blockchain Technology” combines the
time-series model with fiscal science and puts forward a model for the fiscal budget
variance of China’s national general public budget. Chapter “LLE Based K-Nearest
Neighbor Smoothing for scRNA-Seq Data Imputation” proposes LLE-based k-nearest
neighbor smoothing for scRNA-seq data imputation, where the data are high-dimensional,
sparse, and noisy. Chapter “The Application of Time Series Analysis in the Fiscal
Budget Variance of China” is about the application of time-series analysis to the
fiscal budget variance of China.

We would like to take this opportunity to thank all the participants at the Second
International Academic Forum on Financial Mathematics and Financial Technology. We are
also pleased to acknowledge the support of the School of Mathematics, Renmin University
of China, and the Engineering Research Center of Finance Computation and Digital
Engineering, Ministry of Education.


Beijing, China Zhiyong Zheng


Abstract There are five models of fintech development in the world: the technology
promotion model represented by the USA, the rule-driven model represented by
the UK, the market pull model represented by China, the mixed competition model 
represented by Japan and Indonesia, and the model of fanning out from point to area 
represented by South Korea and Israel. In terms of the layout, the transformation of 
traditional financial hubs has been accelerated, China and the USA have outstand- 
ing advantages in fintech, and the Asia-Pacific region has great potential for fintech 
development. The fintech of China has been promoted to the world's leading level;
Japan boosts the rapid growth of fintech through advantages of backwardness; Sin- 
gapore gathers innovative resources with a relaxed and inclusive atmosphere; South 
Korea promotes scale development of fintech industry by fanning out from point to 
area; India is gradually exerting its potential for fintech development; Israel builds 
the highland of fintech development through guidance plus service; Indonesia has 
gradually become a rising star in fintech development in Southeast Asia; Hong Kong 
promotes the momentum of sound fintech development with government assistance. 


Keywords Fintech development · Asia · Policy and regulatory measures ·
Digital transformation


1 Overview of Global Fintech Development 


In recent years, global fintech has maintained a high speed of development, the
adoption rate of fintech has gradually increased, and a large number of fintech unicorn
enterprises have emerged. As the application of big data, blockchain, AI, and other
technologies in the financial field becomes more and more mature, new models and
industry forms of financial service have come into being. Among them, some application
fields have developed more rapidly, including digital currency, open banking, and
digital banking.


1.1 Development Dynamics 


Fintech enterprises are growing fast. According to the relevant data, there were 1057
unicorn enterprises in the world as of November 2021. Fintech unicorns play a decisive
role: fintech is the sector with the most enterprises on the list, 139, with a total
valuation of 4.7 trillion yuan, accounting for 19% of the total valuation of unicorn
enterprises on the list. From a country perspective, the USA has the largest number of
unicorn enterprises in the fintech sector, followed by China. In the 2019 Fintech 100
announced by KPMG (Klynveld Peat Marwick Goerdeler), enterprises in the Asia-Pacific
region (including Australia and New Zealand) performed strongly, with a total of 42
enterprises on the list. Payment enterprises took the lead, with 27 companies on the
list. Among the other categories, there were 19 wealth management companies, 17
insurance companies, 15 lending companies, and 13 companies with relatively
comprehensive financial business.

The developing economies represented by Southeast Asia and Latin America show
distinctive development characteristics in the field of fintech. According to the
report The Future of Southeast Asian Fintech by the British consultancy Dealroom, the
European venture capital company Finch Capital, and the Indonesian venture capital
company MDI Ventures, the outbreak of COVID-19 has accelerated the digital
transformation of fintech in the region, especially in the field of digital payment.
Indonesia is expected to become the largest fintech hub in the region by 2025, with an
expected market value of US$130 billion in related fields. According to the global
fintech report for the second quarter of 2021 by CB Insights, fintech financing in
Latin America has increased at a compound annual growth rate of 57% since 2016,
reaching US$4.246 billion by the second quarter of 2021. The financing of fintech
companies in Brazil alone accounts for 70% of the total financing in the region.

More and more central banks have begun to actively study the issuance of central bank
digital currency (CBDC), and some countries have even begun to build the underlying
infrastructure of CBDC and to pilot CBDC technology. China was the first country in the
world to launch a sovereign digital currency: since April 2020, DC/EP has been piloted
in some domestic cities, commercial banks, and cross-border payments, and the country’s
first digital RMB insurance policy was completed in December 2020. In 2020, the Bank of
France launched a digital currency pilot project. European and American countries are
also unwilling to fall behind. The central banks of Canada, Sweden, the UK, and other
countries jointly set up a CBDC group with BIS. In May 2020, the United States released
a white paper on the Digital Dollar Project (DDP), which introduced in detail the basic
architecture, distribution purpose, and potential application scenarios of CBDC in the
United States.

The evolution of digital banking is accelerating. As a banking development model that
has arisen in recent years, digital banking is an important achievement of the digital
transformation of banks. Currently, 60% of the world's banking population uses digital
banking through online services and cashless transactions. According to a report by
Nets (a European transaction processing center), contactless digital wallet
transactions increased by more than two-thirds in the first half of 2020 compared with
2019. As the number of digital banking users has grown, the number of digital banks has
gradually increased. In 2019, the Hong Kong Monetary Authority (HKMA) approved the
establishment of 8 virtual banks. The Monetary Authority of Singapore (MAS) opened up
applications for digital banking licences in 2020. In addition, digital banks in many
countries engage in online banking business with a traditional banking license or in
traditional authorized business forms, such as Monzo Bank and N26 Bank in the UK, and
aiBank, WeBank, and MYbank in China.

The world has a deeper understanding of the concept of sustainable development, and the
practical applications of fintech in the field of green finance have increased. From
the perspective of application scenarios, the use of fintech tools covers ESG
investment and financing, national carbon market trading, green building, green
consumption, green agriculture, small and micro enterprises, and other fields. Fintech
is widely used in environmental data, ESG data and evaluation, the green credit
information management systems of financial institutions, and other scenarios.


1.2 The Financing Profile 


Global fintech investment and financing grew strongly. In 2020, the number of financing
transactions reached 3443, and the number in the first three quarters of 2021 was 3549,
already exceeding the total for the whole of 2020. The total financing amount of
fintech in 2020 was US$48.4 billion, and the amount in the first three quarters of 2021
was US$94.7 billion, nearly twice the total financing amount of 2020.
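
As a quick arithmetic check on the figures quoted above, the following minimal Python
sketch compares the first three quarters of 2021 with the whole of 2020. The function
name growth_ratio and the variable names are illustrative choices only, not taken from
any cited report; the numbers are those given in the text.

```python
def growth_ratio(current: float, previous: float) -> float:
    """Return current / previous, i.e. how many times larger current is."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return current / previous


# Figures quoted in the text (amounts in US$ billion).
deals_2020, deals_2021_q1_q3 = 3443, 3549
amount_2020, amount_2021_q1_q3 = 48.4, 94.7

deal_ratio = growth_ratio(deals_2021_q1_q3, deals_2020)
amount_ratio = growth_ratio(amount_2021_q1_q3, amount_2020)

# The deal count is only slightly higher, while the amount is close to double.
print(f"deals: {deal_ratio:.2f}x, amount: {amount_ratio:.2f}x")
```

The ratio of about 1.96 for the financing amount is consistent with the description
"nearly twice", while the number of transactions grew by only about 3%, which suggests
that the average deal size rose sharply.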

Financing is mainly concentrated in North America, Asia, and Europe, with a
quarter-on-quarter increase of more than 50%. North America has the highest total
financing, accounting for more than half of total global investment and peaking in the
second quarter of 2021 at US$16.56 billion. It is followed by Asia, which peaked in the
third quarter of 2021 at US$5.9 billion. South America exceeded US$1 billion for the
first time in the second quarter of 2021. In Africa and Oceania, the amount of
financing is relatively stable and has changed little (Fig. 1).


Fig. 1 The amount of


1.3 Regulatory Environment


In recent years, financial management departments in various economies have 
increasingly improved their regulation on fintech activities, and have promoted the 
healthy and orderly development of fintech through measures such as continuous 
monitoring, the establishment of regulators, and the introduction of regulatory poli- 
cies. On the one hand, financial management departments support the entrance of 
fintech companies into the market to make up for the current weak links in financial 
services; on the other hand, countries have set a high threshold for access to financial 
business to reasonably guard against systemic risks. 

The legislative process of data protection has been accelerated. In recent years, with
the iterative innovation of new technologies, various business entities are
accelerating the development of new data resources, and the resulting problems such as
data privacy protection are increasingly valued by various countries. EU countries
summarize and improve data legislation in practice: since the second half of 2019, the
European Commission (EC) and the Council of the European Union have organized each
member country's regulators to submit a law enforcement summary, and they have received
19 law enforcement summary reports from different countries. In September 2020, the
European Data Protection Board (EDPB) issued the Guidelines on the Targeting of Social
Media Users (the Draft Guidelines), expounding on the requirements of data protection
in social media. At the beginning of 2020, the California Consumer Privacy Act (CCPA)
of the USA formally came into force and was formally incorporated into California's
judicial system. On October 21, 2020, the People's Republic of China released the
Personal Information Protection Law (Draft) and solicited public opinions. It is the
first law that specifically stipulates personal information protection and is the basic
law in this field. The Personal Information Protection Law of the People's Republic of
China officially came into force on November 1, 2021.

The innovation of fintech regulation tools has been continuously strengthened. Firstly,
some countries have established fintech innovation mechanisms. For example, France
proposed in March 2021 to establish a European exemption mechanism for blockchain that
would relax some legal requirements that cannot meet the needs of blockchain
development, and suggested that exempted entities should follow the key principles of
financial regulation. Secondly, some countries have further improved their sandbox
mechanisms. Thirdly, many countries vigorously support the development of RegTech. For
example, the Central Bank of Brazil (CBB) announced in April that Pier, an information
integration platform for financial regulators based on blockchain technology, had begun
operating online. The platform helps participating institutions quickly access the
latest data of other institutions, shortening data queries that might have taken a
month to several seconds.

The fintech policy system has been continuously improved. Nowadays, countries 
all over the world gradually realize the potential value of fintech and formulate 
relevant development strategies and improve relevant policy systems to support the 
development of fintech. At present, apart from the policies related to AI, blockchain, 
big data, and other key underlying technologies of fintech, areas such as digital 
banking, online payment, and encrypted assets are gradually covered. The regulation 
on the application of fintech has basically realized full coverage, and the fintech policy 
system is continuously improved. 


1.4 The Models of Fintech Development 


At present, around the world there are generally five models of fintech development. 
The first is the Technology Promotion Model represented by the USA, which is char- 
acterized by mutual promotion of finance and technology and a win-win relationship 
between industry and culture. The second is the Rule Driven Model represented 
by the UK, which is characterized by innovating regulatory methods and boosting 
industrial development through rules. The third is the Market Pull Model represented 
by China, which is characterized by accelerated digital transformation and break- 
throughs sought in strict regulation. The fourth is the Mixed Competition Model 
represented by Japan and Indonesia, which is characterized by accelerating the pace 
of reform and continuous stimulation of potential. The fifth is the Model of Fanning 
out from Point to Area represented by South Korea and Israel, which is characterized 
by locating breakthroughs and focusing on tackling key problems (Fig. 2). 


1.5 Spatial Layout 


In recent years, the fintech hubs represented by Shanghai, Beijing, Shenzhen, Hangzhou,
San Francisco (Silicon Valley), New York, London, and Chicago are accelerating their
rise based on financial industry and driven by technology. China and the USA have their
distinctive advantages in the development of fintech and have become leaders in the
development of fintech worldwide. The Asia-Pacific region has gradually demonstrated
its potential for fintech development and has attracted a large influx of capital,
showing its advantage of backwardness.


Fig. 2 The models of fintech development around the world

The transformation of traditional financial hubs has accelerated, and there is great
potential for the development of fintech in the Asia-Pacific region. As technology
comprehensively empowers and transforms finance, traditional financial hubs are
transforming faster, newly emerging financial cities are being upgraded in an all-round
way, and a new regional economic ecology is being created from a strategic height. In
the future, financial hubs will without exception take fintech as the core
competitiveness of their cities and compete for the commanding heights of fintech.
According to the Global Fintech Hub Report 2021, the 9 cities in the first echelon of
global fintech hubs were Beijing, San Francisco (Silicon Valley), New York, Shanghai,
Shenzhen, London, Hangzhou, Singapore, and Chicago. These cities are home to large
financial institutions and the national headquarters of financial institutions, and
most of them have a solid foundation in the financial industry. They are now beginning
an all-round, technology-supported digital transformation of the financial industry.
From the perspective of fintech experience, developing countries and Asia continue to
maintain an overall leading edge: the top 10 cities are all located in developing
countries in Asia, and for two consecutive years developing countries have accounted
for 80% of the top 20 cities and Asian cities for 65%.


2 Practice of Fintech Development in Asia


2.1 China—The Fintech Has Been Promoted to the World's
Leading Level


2.1.1 Development Features: Accelerated Digital Transformation 


According to the development stages of technology application in financial industry,
the development nodes of China's fintech industry are relatively clear. The development
of fintech in China can be divided into four stages, as shown in Fig. 3. China has
entered the fintech 4.0 era, when finance and technology develop in a highly integrated
way.

The development of the fintech industry has leapt into the front ranks of the world.
There are 139 unicorns in China's fintech industry, ranking first in the world. The
market scale of China's fintech enterprises is growing steadily. According to forecasts
based on data from the Forward Looking Industry Research Institute, the market scale of
China's fintech enterprises is expected to reach 463.1 billion yuan in 2021, an
increase of nearly 17% over the previous year. The scale of China's fintech market is
expected to continue growing steadily in 2022.

Great progress has been made in technological innovation. From 2015 to the first 
half of 2019, a total of more than 22,000 enterprises applied for fintech-related patents 
in China, with a total number of more than 88,000 patents. Among them, big data 
analysis, interconnection technology, and cloud computing accounted for the highest 
proportion, while big data, cloud computing, biometric security, and AI maintained 
relatively smooth and steady growth; blockchain technology performed brilliantly 
with explosive growth, with the proportion of patents increasing from 0.4% in 2015 
to 8.5% in 2019 (Fig. 4). 


Fintech 1.0 stage: With the rapid development of communication technology and
information technology, finance breaks national boundaries, and the cross-investment of
financial institutions is also greatly accelerated. In this era, the providers of
financial services are mainly banks.

Fintech 2.0 stage: On the basis of using modern communication network technology,
financial institutions pay more attention to the application of database technology.
The banking industry tries to centralize the banking data gradually and improve the
service level on the basis of modern communication network and database technology.

Fintech 3.0 stage: The in-depth integration of Internet technology and financial
business has penetrated into the system integration of financial business, business
process reengineering, financial system interconnection and information sharing, and
the construction of information security system and risk prevention and control system,
which gives birth to a large number of business models, new carriers and businesses.

Fintech 4.0 stage: Innovate with subversive technologies such as big data, cloud
computing, artificial intelligence and blockchain, decompose the traditional banking,
securities and insurance business, provide efficient, high value-added and convenient
goods and services, and greatly reduce transaction costs, improve the operational
efficiency of the financial industry.


Fig. 3 Development process of China's fintech


Fig. 4 Fintech patents


There is a shortage of fintech talent. At present, fintech talent is in short supply,
and its growth rate is far lower than the development rate of fintech itself. According
to the 2018 China Fintech Employment Report released by Michael Page (China), 92% of
the fintech enterprises interviewed found that China is currently confronting a severe
shortage of fintech professionals, 85% of the employers interviewed said that they
encountered recruitment difficulties, and 45% said that the greatest difficulty in
recruitment was finding candidates who could meet the specific position requirements.
According to the survey, the most popular fintech positions were in big data, AI, and
risk management, accounting for 40%, 32%, and 12%, respectively.


2.1.2 Policies and Regulatory Measures: Finding Breakthroughs in 
Strict Regulation and Ensuring Steady Development of Data 
Protection 


The top-down design for fintech development has been continuously improved. In August
2019, the People’s Bank of China issued the Financial Technology (Fintech) Development
Plan (2019-2021). This programmatic document built the top-level design of “four beams
and eight columns” for fintech. In December 2021, the central bank issued the Fintech
Development Plan (2022-2025), its second fintech development plan after the one
released in 2019. Compared with the first plan, the second focuses on solving the
problem of uneven and insufficient fintech development, with clearer key tasks, a
clearer development direction, and stronger implementation guarantees. At the same
time, the plan puts forward the fintech development vision of "striving to achieve the
leap forward improvement of the overall level and core competitiveness by 2025", which
has opened up a broader development space for China's fintech industry.

The system of fintech supervision rules has been gradually improved. While improving
the rule system in individual technical fields, it enriches the supervision of business
links such as fintech innovative product design, operation mode, and risk control
means. In addition, it further complements and improves the regulatory rules for the
protection of consumer rights and interests, personal privacy, and financial
information data.

The standardization of fintech has been gradually strengthened. The central bank has
issued and implemented technical standards for payment tokenization, payment
information protection, acceptance terminal registration management, mobile terminal
trusted execution environments, and mobile financial client application software. It
has also incorporated fintech products into the national unified certification system,
and has continued to carry out leader activities in the fields of point of sale (POS)
terminals, self-service terminals (ATM), bar code payment acceptance terminals, and
online banking services.


2.1.5 Layout of Key Fintech Cities: The Cities in East China are 
Leading, but Each of the Cities has Its Own Characteristics 


At present, China is already leading global fintech. However, there are differences in
the development speed and level of fintech among its cities. The overall strength of
the cities in east China is relatively strong, the optimized layout of Beijing's
fintech develops steadily, Shanghai tries to build an international brand of fintech,
Shenzhen strives to play the leading role in the development of fintech in the
Guangdong-Hong Kong-Macao Greater Bay Area, and Hangzhou adopts the strategy of policy
plus talents to create new vitality for the city's development. Cities such as Chengdu,
Chongqing, Guangzhou, Nanjing, and Qingdao are also actively laying out the development
of fintech.


2.2 Japan—Boosting the Rapid Growth of Fintech Through 
Advantages of Backwardness 


2.2.1 Development Features: The Advantage of Backwardness in 
Fintech has Shown 


The comprehensive competitive strength lays the foundation for the development of
fintech. Japan is the third largest developed country in the world, but its fintech
development began relatively late. In 2018, the scale of Japan's fintech market reached
214.5 billion yen, and it has been rising ever since. It was expected to reach 572.7
billion yen in 2020, with an average annual growth rate of more than 50% (Fig. 5).
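
The growth rate quoted above can be checked with the standard compound annual growth
rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is illustrative
only: the function name cagr is an assumption, and it treats 2018 to 2020 as two
compounding years, using the market sizes given in the text.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes quoted in the text, in billion yen.
market_2018 = 214.5
market_2020_expected = 572.7

# Assumption: two compounding periods (2018 -> 2019 -> 2020).
rate = cagr(market_2018, market_2020_expected, years=2)
print(f"Implied average annual growth: {rate:.1%}")  # roughly 63%
```

Under this assumption the implied rate is a little over 63% per year, which agrees with
the statement that average annual growth exceeds 50%.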

Fig. 5 The scale of Japan's fintech market (Unit: 100 million yen, %)


The cultural soft environment is being optimized and the shaping of a non-cash society
accelerated. According to the EY Global Fintech Adoption Index 2019, Japan ranked the
lowest of 27 markets on the global consumer fintech adoption index, with only 34%. The
Japanese government issued Fintech Vision in May 2017, which clearly proposed paying
attention to the added value of fintech and focusing on improving the adoption rate of
electronic payments. After that, the government issued Future Investment Strategy 2017,
explicitly proposing to triple the proportion of non-cash payments to more than 40% by
Expo Osaka 2025. Since then, the Japanese government has been committed to promoting
non-cash payment rebate activities (consumers would get a rebate of about 2-5% for each
non-cash payment), continuously optimizing the cultural environment and accelerating
the shaping of a non-cash society.

The commercial configuration of various industries has gradually taken shape. The
mobile payment sector has left behind its era of wild growth and formed a duopoly of
LINE Pay and PayPay, and a number of outstanding fintech start-ups have been nurtured
on this basis. Japan attaches great importance to the development of blockchain. In
2018, the size of Japan's blockchain market reached 8.07 billion yen, and it reached
33.57 billion yen by 2020. With relatively strong development momentum, the sector saw
the emergence of a number of blockchain start-ups with certain strength and
characteristics, such as Doublejump.tokyo and Nayuta. Japan's regulation on network
lending is relatively loose, and network lending and crowdfunding have become an
important part of Japan's inclusive finance, hence the rapid development of the
industry. In 2014, when Japan amended its financial commodity trading law, crowdfunding
suddenly came to the fore. In 2018, the scale of Japan's crowdfunding market reached
204.5 billion yen. During the epidemic, many crowdfunding platforms also took on
responsibilities such as assisting commercial tenants. Japan is also actively promoting
the development of sectors such as personal loans, robo-advising, and supply chain
finance. Although they are still in the initial stage, they have great potential for
future development (Fig. 6).


Fig. 6 A diagram of the industrial ecology of Japan's fintech


2.2.2 Policy and Regulatory Measures: Optimizing the Policy System 
and Forging Ahead with Determination 


In its fintech development policy and regulatory measures, Japan combines strict
regulation with easing measures. Regulation of some traditional industries,
especially of their digital transformation, is relatively strict, whereas regulation
of sectors such as crowdfunding and network lending is relatively loose, so these
sectors can develop rapidly. Strict regulatory measures can effectively control the
risks of fintech innovation. Moreover, in 2018, JFSA started to implement a sandbox
mechanism for financial innovation, allowing financial and insurance products to be
put into trial operation within a certain risk range and steadily promoting healthy
and sustainable innovative development. The looser measures in some sectors
stimulate the vitality of the fintech industry, allowing it to reform as it develops,
remain stable as it progresses, and build a safe and controllable development
ecology in an all-round way.


2.2.3 Layout of Key Fintech Cities: Tokyo Bay Area Endowed with 
Good Resources to Push Traditional Financial Institutions on the 
Way of Reform 


Fintech developed later in Japan than in other developed economies, so the country
has not yet formed a widespread layout of fintech hubs. Whether in the fintech hub
report released by the Global Fintech Hub Federation or in the indexes and lists of
fintech hubs released by institutions such as Deloitte and Z/Yen Group, the Japanese
fintech hub that makes the list tends to be Tokyo. Therefore, this chapter focuses on
the situation and policy measures of Tokyo as a fintech hub. Tokyo ranks among the
internationally renowned financial centers, alongside New York and London. Tokyo is
also the capital of Japan and its financial capital. Since the 1960s, the Japanese
government has been planning to build the Tokyo capital circle, linking Tokyo with
several neighboring prefectures for joint development and construction. At present,
the Tokyo Bay Area has become one of the world's eight recognized bay areas.

Since the 1990s, the Japanese government has formulated and promulgated a
series of science and technology innovation strategies and policy measures to
stimulate a rapid rise in the level of science and technology innovation in the
Tokyo Bay Area. Relying on world-class universities and research institutions,
clusters of innovative enterprises, and the Japanese government's policy support,
the Tokyo Bay Area has absorbed advanced technology and innovation concepts as it
opened to the outside world and has vigorously developed advanced scientific and
technological productivity. It has formed a bay area environment conducive to
scientific and technological innovation, spawned numerous innovation institutions,
and produced a large number of scientific and technological achievements, gradually
developing into a world center of innovation with international influence.


2.3 Singapore—Gathering Innovative Resources with a 
Relaxed and Inclusive Atmosphere 


2.3.1 Development Features: An Active Atmosphere for Fintech 
Innovation 


International innovation elements gather and multiple resources converge.
Singapore is an international trade hub, an Asian financial center, and a place of
strategic importance for technological innovation. Its convenient geographical
location has facilitated the convergence of financial and technological innovation
resources. On the one hand, as a global financial center, Singapore counts the
financial industry as its service industry with the highest added value, with more
than 1,200 financial institutions based there. On the other hand, Singapore's
scientific and technological innovation has developed rapidly. In the Global
Innovation Index 2018 released by the World Intellectual Property Organization
(WIPO), Singapore ranked fifth, overtaking traditional science and technology powers
such as the USA, Germany, Israel, South Korea, and Japan and taking its place
among the world's leading science and technology innovation centers.

Fintech activities are rich in form and full of vitality. Since 2016, Singapore
has been hosting the Singapore Fintech Festival (SFF) and the Singapore Week of
Innovation & Technology (SWITCH). In 2019, SWITCH and SFF merged into SFF
X SWITCH for the first time. On June 8, 2020, building on its previous experience
of holding such events, Singapore introduced a new format and held the MAS Global
Fintech Innovation Challenge for the first time. With the theme of "Building
Defenses, Seizing Opportunities, and Emerging Stronger", the competition had a total
prize pool of S$1.75 million and comprised two parts: the MAS FinTech Awards and
the MAS Global FinTech Hackcelerator.

Digital banking booms and digital finance accelerates. At present, Singapore is
gradually loosening the restrictions on applications for digital full bank licenses.
The introduction of digital bank licenses is the largest banking liberalization in
Singapore in the past 20 years. In December 2020, MAS issued a total of 4 digital
bank licenses, of which 2 were digital full bank (DFB) licenses and 2 were digital
wholesale bank (DWB) licenses. The launch of digital banks in Singapore will create
competition for traditional banks, but it will also promote the rapid development of
fintech in Singapore.

Actively extending the application scenarios of blockchain technology. Singapore
is the country friendliest to the development of blockchain in Southeast Asia and
even in the world. At present, a large number of mature blockchain projects are
distributed across sectors such as trading platforms, public blockchains, hosting,
cloud storage, infrastructure, consulting, and insurance. Singapore vigorously
promotes the application of blockchain technology in financial scenarios. On the one
hand, it uses blockchain technology to promote the development of digital payment.
On the other hand, it focuses on SME financing and supply chain finance. In addition,
it adopts blockchain technology to ease service pain points in industrial finance,
including supply chain finance.

In a sound fintech ecology, various participants jointly pursue interconnected
development. Singapore's rich and diverse international fintech activities and its
open and inclusive innovation environment have attracted a diverse pool of fintech
talent. In August 2020, Singapore established the Asian Institute of Digital Finance
(AIDF), jointly founded by MAS, the National Research Foundation (NRF) of Singapore,
and the National University of Singapore (NUS), to meet the demand for digital
financial services in Asia. The strong community effect has attracted talents such
as entrepreneurs, domain experts, angel investors, and industry mentors, and it has
provided a platform for exchanges and collaboration among entrepreneurs, investors,
and financial institutions. Meanwhile, it has attracted high-quality start-ups from
Asian countries such as India and Indonesia to migrate to Singapore, forming a
highly open international fintech ecosystem (Fig. 7).


2.3.2 Policy and Regulatory Measures: The Top-Down Design is 
Optimized and Special Policies are Increased 


Perfecting the top-down design of fintech. The Singaporean government has
authorized MAS to be the policy body for the innovation and development of fintech,
fully responsible for the strategic planning, policy framework, and policy
coordination of fintech development. In order to further promote the coordinated and
efficient development of fintech, MAS has established dedicated fintech management
institutions. Firstly, the FinTech & Innovation Group (FTIG) was set up, which
comprises three offices, responsible respectively for payment and technology
solutions, technology infrastructure, and the technology innovation lab. FTIG
invested S$225 million to promote the Financial Sector Technology & Innovation
Scheme (FSTI) and encourage the global financial industry to set up innovation and
research and development centers in Singapore. Secondly, the Fintech Office was
established, which is mainly responsible for three tasks: the first is to review,
align, and improve fintech-related subsidy schemes across government agencies; the
second is to attend to industrial infrastructure and the gap between talent training
and manpower demand, and to put forward targeted strategies, policies, and programs
to enhance the competitiveness of the industry and its enterprises; the third is to
manage Singapore's fintech brand and marketing strategy through fintech activities
and related initiatives, with the aim of making Singapore a global fintech hub.

Fig. 7 Fintech ecosystem

Formulating special fintech regulatory policies to promote the healthy
development of enterprises. The Singaporean government has adopted a multi-pronged
approach to promote the development of fintech at home. Some of the measures
are universal, including providing a supportive environment for start-ups, adopting
a collaborative approach, and attracting foreign investment. Apart from these common
measures, Singapore has formulated special fintech regulatory policies to guide the
development of various segments of fintech, mainly focusing on AI and data analysis
(hereinafter referred to as AIDA), blockchain technology, digital assets, payment
bill, and open banking.

Promoting the development of RegTech applications to reduce risks. SGX has
launched a new RegTech scheme that can automatically report market irregularities
and promote fair trading. At present, Singapore has representative companies
ranked among the world's top 100 in RegTech, in sectors such as compliance
management and anti-money laundering. Besides, for local start-ups, the Singaporean
government has established a Regulatory Sandbox system to encourage the innovative
development of fintech start-ups and to nurture and incubate outstanding enterprises.


2.4 South Korea—Promoting Scale Development of Fintech 
Industry by Fanning Out From Point to Area 


2.4.1 Development Features: The Foundation for the Development of 
Fintech is Solid 


5G information and communication technology takes the lead. According to
data from South Korea's Ministry of Science & Information and Communication
Technology (MSICT), South Korea was the first country in the world to put 5G into
commercial use. Since the official launch of 5G services on April 3, 2019,
the number of users has increased continuously, reaching nearly 10 million by the
end of October 2020.

Blockchain technology is developing rapidly. According to data from the Korean
Intellectual Property Office (KIPO), a total of 1,301 blockchain patents were
registered in South Korea in 2019, more than 50 times the 24 registered in 2015,
and the number of patents increased further in 2020 after the COVID-19 pandemic began.
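
The growth multiple quoted for blockchain patents can be checked with a short
calculation. The sketch below uses only the two patent counts given in the text;
the function name fold_increase is illustrative and does not come from the source.

```python
def fold_increase(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


# Patent counts as quoted in the text (KIPO data).
patents_2015 = 24
patents_2019 = 1301

ratio = fold_increase(patents_2015, patents_2019)
print(f"2019 registrations are about {ratio:.1f} times the 2015 level")
```

On these figures the 2019 count is a little over 54 times the 2015 count, which is
consistent with the rounded "50-fold" description.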

With a mechanism design of two kinds of institutions and three modes of
sharing, the big data credit system is refined. South Korea has established a highly
refined personal and corporate big data credit system. In this system, the Korea
Federation of Banks (KFB) is the pillar of the credit industry. On this basis, there
are three data-sharing modes for credit information services. The first requires
financial institutions to submit credit information to KFB, which then provides it
to private credit reporting companies; the second shares information within an
industry through associations or corporate groups; in the third, credit reporting
companies collect other information through commercial contracts. Under this
mechanism design of two kinds of institutions and three modes of sharing, the South
Korean credit reporting industry can not only collect nationwide credit information
quickly and in a timely manner, but also ensure that valuable credit information is
legally and fully shared across the whole of society.

The P2P loan industry develops rapidly. According to the statistics of the South
Korean government, the total investment in P2P loans in South Korea increased from
37.3 billion won at the end of 2015 to 2.34 trillion won at the end of 2017, and
then rapidly increased to 6.2 trillion won by the end of June 2019. In August 2020,
the Law on Financial Industry Related to Online Investment and laws related to user
protection (P2P laws) were officially implemented, strengthening the protection of
investors and formally setting a legal framework for P2P development.


2.4.2 Policy and Regulatory Measures: Strengthen Planning and 
Launch Fintech Development Strategy 


On December 4, 2019, the Financial Services Commission (FSC) of the Republic of
Korea announced that it would vigorously promote the large-scale development of the
fintech industry and introduced 8 measures across different sectors: improving
the current Regulatory Sandbox system; carrying out regulatory reform to promote
the development of fintech; loosening the entry restrictions of the financial
industry; establishing a regulatory basis for the digital age; developing new growth
engines for financial innovation; promoting investment in fintech and establishing a
venture capital ecosystem with private sector investment at its core; assisting
fintech enterprises with overseas expansion; and expanding public support for
fintech enterprises.

Preferential taxation is applied to research and development of blockchain
technology. A report released by the Ministry of Strategy and Finance announced the
latest tax laws, which came into effect in February 2019. Blockchain has been
added to the list of research and development fields eligible for tax credits. This
means that companies developing blockchain technology can deduct part of their
research and development expenses from their taxes. The size of the tax reduction
depends on the size of the company.

Implementing the Regulatory Sandbox plan and accelerating digital transformation.
On April 1, 2019, the FSC officially launched a fintech Regulatory Sandbox program,
hoping thereby to promote competition in South Korea's financial industry and bring
more favorable services to consumers. Up to May 2020, the FSC had held a total of 14
assessment committee meetings. Through assessments of business innovation, consumer
convenience, and project stability and feasibility, a total of 102 innovative
financial services were eventually selected into the Regulatory Sandbox and obtained
exemption from licensing and other regulations.


2.4.3 Layout of Key Fintech Cities: Seoul—Taking Various Measures 
to Create a Business Environment for Fintech 


Seoul is the largest city on the Korean Peninsula and one of the major financial
cities in Asia, with advantages in economic, technological, and cultural development.
Seoul ranks fifth among the top ten Asian cities in terms of economy, behind only
Tokyo, Shanghai, Beijing, and Hong Kong. Its economic aggregate accounts for about
23% of that of South Korea. The population of Seoul is about 10.2 million, accounting
for 20% of the total population of South Korea. In addition, Seoul accounts for half
of South Korea's personal income tax, corporate income tax, and bank deposits, and
for 30% of its innovative enterprises and its college and university graduates.
Seoul is South Korea's political and economic center, which has laid a good
foundation for fintech development. In recent years, Seoul has created a good
business environment for fintech by holding the fintech week, establishing fintech
labs, setting up innovation funds to increase investment in fintech industries, etc.

Holding fintech week to create an innovative atmosphere. South Korea held
the first Korea Fintech Week from May 23 to 25, 2019, at Seoul's Dongdaemun Design
Plaza (DDP). It was the first global fintech fair in South Korea, and the FSC hopes
to develop it into an important annual fintech event in Asia. In 2019, Korea Fintech
Week invited global financial institutions, international organizations, and global
fintech companies to discuss relevant policies and to help local fintech companies
expand ties with local and global investors. In addition, the event provided
counseling services for college students and young job seekers interested in the
fintech industry.

Because of the epidemic, the 2020 Korea Fintech Week was moved online. The FSC
said the event had attracted more than 170,000 page visitors and received
more than 110 million page views. Financial companies and fintech enterprises set
up a total of 150 virtual exhibition halls. A total of 35 enterprises participated in
the online job fair session, with more than 1,000 job seekers competing for about 80
jobs provided by 21 fintech enterprises.

Launching SEOUL FINTECH LAB. In April 2018, the Seoul municipal government
launched the first SEOUL FINTECH LAB, which was mainly aimed at early-stage
start-ups. In July 2019, the second fintech lab was launched, targeted at
growth-stage start-ups and accommodating approximately 14 start-ups from
South Korea, the USA, Hong Kong, and Singapore. The fintech lab is intended to
become a key anchor point for South Korea's fintech industry and to help South
Korea's promising fintech start-ups develop abroad.

The Seoul Innovation Growth Fund was established. At the end of 2018, Seoul set up
an innovation fund of around 130 billion won (approximately USD 116 million) to be
invested exclusively in innovative industries such as blockchain and fintech. In
early 2019, the Seoul municipal government announced that it would invest 1.2
trillion won (USD 1.07 billion) in start-up companies in the fintech sector through
the investment fund by 2022.


2.5 Kazakhstan—Digital Transformation Speeds Up the
Construction of Central Asian Fintech Hub


2.5.1 Development Features: The Digital Economy is Booming 


Central Asia is located between the world's two largest economies. Under the
Belt and Road Initiative, it can be regarded as a bridge connecting Europe and China.
As a growing potential market, Kazakhstan has played a key role in the process of
Central Asia becoming a global fintech hub.

Although it started from a relatively low level, digitization in Kazakhstan is
developing rapidly, mainly through: (1) the rapid growth of e-commerce and mobile
commerce; (2) the transition from cash payments to contactless and digital
payments; (3) the growth of innovative digital financial products and services.
Since the outbreak of COVID-19, these structural changes, which had been under way
for many years, have accelerated, creating a favorable environment for the
further development of fintech.

The fast-growing e-commerce market is one of the driving forces behind the
development of fintech. Compared with other emerging market countries and developed
economies, e-commerce penetration in Kazakhstan has significant upward potential.
According to Euromonitor's data, the market value of e-commerce in Kazakhstan
in 2019 was estimated to be KZT 401.3 billion (USD 1.1 billion), equivalent
to 3.4% of the total volume of retail trade, with a CAGR of 33.3% from 2016 to
2019.
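
The compound annual growth rate (CAGR) quoted for Kazakhstan's e-commerce market
links the 2019 market value to an implied 2016 starting value. The sketch below is
illustrative only: it assumes the standard CAGR definition and three compounding
years between 2016 and 2019, and the implied 2016 figure it derives is not reported
in the source. The function names are hypothetical.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end / start) ** (1 / years) - 1


def implied_start(end: float, rate: float, years: int) -> float:
    """Starting value that grows to `end` at `rate` per year over `years` years."""
    return end / (1 + rate) ** years


value_2019 = 401.3  # KZT billion, as quoted in the text
rate = 0.333        # CAGR for 2016-2019, as quoted in the text
years = 3           # assumption: 2016 -> 2019 counts as three compounding years

# Derived for illustration; the source does not state a 2016 market value.
value_2016 = implied_start(value_2019, rate, years)
print(f"Implied 2016 market value: KZT {value_2016:.1f} billion")
print(f"Round-trip CAGR check: {cagr(value_2016, value_2019, years):.1%}")
```

Under these assumptions the implied 2016 market value is about KZT 169 billion,
meaning the market grew to roughly 2.4 times its 2016 size over the three years.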

Digital payment is developing rapidly. The adoption rates of the Internet and
mobile phones have increased significantly. According to Ovum (world mobile
information service), the total number of smartphones in Kazakhstan increased from
12.7 million in 2016 to 19.2 million in 2019 and is expected to reach 23.4 million
by 2024. Banks are using mobile and Internet banking to provide better financial
services to remote and rural areas. Fintech companies have fewer opportunities to
cooperate with the traditional financial sector, but with the rising adoption of
mobile Internet, they have gained huge opportunities in areas not covered by
traditional financial markets. In 2019, Kazakhstan's digital payment volume more
than tripled to about USD 34.8 billion. As with e-commerce, this trend has been
accelerated by the COVID-19 epidemic. In 2019, Kaspi.kz accounted for 83% of the
growth of the entire payment market in Kazakhstan and became the largest contributor
to Kazakhstan's transition to digital payment.

The time is right for the digital transformation of the banking and insurance
industries. The COVID-19 epidemic has moved most retail banking activities online,
which has promoted the development of digital banking. Banking is rapidly shifting
from branch-based, product-centric organizations that use traditional technologies
to more personalized digital solutions that are consumer-centric and delivered
seamlessly. Since January 2019, the citizens of Kazakhstan have been able
to use electronic insurance and choose to submit their applications online. Within
the conceptual framework for the development of the financial sector of the Republic
of Kazakhstan to 2030, it is expected that electronic policy sales will be
introduced for compulsory classes of insurance.


2.5.2 Policy and Regulatory Measures: Being Committed to Promoting 
Financial Innovation in a Wider Range of Areas 


The Astana Financial Services Authority (AFSA), established on January 1, 2018, is
independent of the National Bank of Kazakhstan (NBK) and of Kazakhstan's financial
market supervision and development department. It is the independent regulator of
financial and non-financial services activities in the Astana International
Financial Center (AIFC), the financial center of Astana, Kazakhstan, which was
established on July 5, 2018. Through its fintech hub department, AIFC assists
companies in developing new products and services in the fintech sector in various
ways. One way is to provide acceleration projects in which start-ups can work
closely with mentors from around the world to develop the necessary capabilities.
Another way is for the fintech department of AFSA to provide a legal and regulatory
basis for the development of new financial products and technologies and to test
them at the fintech lab (Regulatory Sandbox). At present, 30 projects are being
tested in the Regulatory Sandbox, and more than 125 start-ups work with the fintech
department of AIFC. These companies are distributed across sectors such as payment
and mobile wallets, marketplaces, credit, AI and machine learning, blockchain,
digital identification, network security, and fraud prevention.

The fintech department of AIFC supports venture capital and corporate innovative
development, with the goal of creating a healthy venture capital ecosystem
and expanding opportunities for start-ups in Central Asia and the countries of the
Commonwealth of Independent States (CIS) to attract investment and carry out
transactions. In this way, AIFC is creating a comprehensive ecosystem that covers
strengthening regulation, supporting start-ups, helping attract investment, and
implementing fintech solutions within enterprises. To address the innovation
challenge in the financial sector, AIFC is taking a series of regulatory measures to
promote innovation and strengthen the protection of consumers of financial services
and products. These include setting up a fintech lab to promote fintech development,
introducing a regulatory framework to promote the development of crowdfunding,
expanding the framework for the list of regulated and market activities,
implementing a series of policies to promote the healthy development of digital
assets, creating a looser banking system to strengthen inclusive financing,
implementing open APIs to promote innovation in digital currency and payment
services, providing corporate income tax and value-added tax exemptions for fintech
companies, optimizing e-commerce regulatory measures, improving the framework to
promote venture capital financing, and launching the Global Financial Innovation
Network (GFIN) to promote cross-border regulation and innovation.


2.5.3 Layout of Key Fintech Cities: Nur Sultan and Almaty—Leading 
the Development of Non-cash Payment 


Kazakhstan's cities with the most active fintech development are undoubtedly Nur
Sultan (the capital) and Almaty (the country's largest city). As the most densely
populated and economically developed cities, they lead in both the volume and the
share of non-cash payments. Almaty's non-cash payments occupied the largest market
share: nearly KZT 7 trillion (about USD 16.5 billion). The city also had the highest
proportion of non-cash payments, at 76.8%. Nur Sultan ranked second with a market
share of KZT 2.9 trillion (approximately USD 6.8 billion) in non-cash payments, and
its proportion of non-cash payments, at 72.2%, also ranked second (Fig. 8).

Fig. 8 Market share and proportion of non-cash payment in different cities and
regions of Kazakhstan

Apart from being absolute leaders in the market share of non-cash payments,
Almaty and Nur Sultan are also major hubs for fintech start-ups in the country.
The country's largest fintech accelerators, science parks, and centers are located
in these two cities: the AIFC Fintech Hub and Nuris in Nur Sultan (Astana), and
TechGarden and Most in Almaty.


2.6 India—Potential for Fintech Development Has Been 
Gradually Exerted 


2.6.1 Development Features: Digital Technology Promotes the 
Innovative Development of Fintech 


Indian fintech enterprises as a whole are in transition from the initial stage
to the growth stage. According to relevant statistics, as of 2019, the number of
fintech start-ups in India ranked second in the world, behind only the USA. Taking
the development of fintech enterprises in several major sub-sectors as an example,
there are only dozens of network lending platforms in India at present, which are in
the initial stage, and few Indians have experienced online lending investment.
India's credit reporting industry is still in the exploratory stage, with a huge
long-tail market. India's crowdfunding industry is in the early stage of
development, showing a slow growth trend since 2014. The crowdfunding industry lacks
clear regulation, and the Securities and Exchange Board of India (SEBI), the main
regulator, has not yet issued regulations.

The commercial forms of fintech are constantly improving. Sectors such as
payment, lending, wealth management technology, personal finance, insurance, and
RegTech have flourished across the board. Take the payment sector as an example:
Internet payment involves enterprises of various kinds, such as telecom operators,
e-commerce firms, banks, and wallet companies, each with its own representative
players, and some representative payment companies such as Paytm are popular with
investors. Indian fintech also has great potential in payment, online lending,
blockchain, robo-advising, inclusive financing, technology-driven integrated banking
services, Internet financial security, biometrics, etc.

Digital payment helps India seize the high ground of fintech innovation. The
digital payment industry has become the core field for accelerating digital capacity
building in India and has greatly helped India seize the high ground of fintech
innovation. In 2016, Prime Minister Modi put forward the slogan "Stand Up India",
formally backing the entrepreneurial trend at the level of national policy with a
view to establishing a new financial ecosystem. He also announced the implementation
of the Digital India program, initiated the banknote scrapping campaign, and made
clear that digital ID cards should be linked to financial services. The banknote
scrapping campaign directly pushed India's fintech industry into the mainstream,
and India's unique payment infrastructure with a unified payment interface won
the trust of the people, which reduced reliance on cash-based payment and eased the
financing difficulties of enterprises. In 2019, India launched the Digital India
program, hoping to digitize every offline transaction by unifying the payment
industry and the e-commerce system. Meanwhile, Mastercard has also launched
a project called Team Cashless India, which helps merchants accept digital payment
and improves the coverage of fintech. In addition, the huge development potential of
India's fintech market has attracted more companies from around the world to enter
it, such as the well-known Chinese enterprises Alibaba, Tencent, and JD.com and the
investment institutions Sequoia Capital and Hillhouse Capital. The international
influence of India's fintech development has been continuously enhanced.

Population advantage lays the foundation for fintech development. India has the
second largest population in the world. According to the statistics of the World
Bank, as of 2018, the population of India was 1.353 billion. In terms of population
structure, people under 35 years old account for 65% of the total, and people
under 25 years old account for 50%. With such a large share of young people, India
has a favorable population structure, and a large pool of talent is available for
fintech development. More young entrepreneurs have the opportunity to start fintech
businesses, and there are more long-tail users of fintech. By 2019, the number of
Internet users in India was 560 million, accounting for about 41% of the total
population. As the country with the second largest Internet demand after China,
India has great room for the development and implementation of intelligent
applications. In terms of the adoption rate of fintech, according to public data,
as of 2019, the adoption rate of fintech in India had reached 87%.


2.6.2 Fintech Development Policies and Measures: Two-Pronged 
Approach of Regulation and Publicity to Accelerate Fintech 
Development 


Supervision is the key to the development of fintech. In fintech regulation,
India adheres to the principle of learning by doing. The top-down design is being
gradually improved, fintech has been brought within the scope of regulation, and a
fintech Regulatory Sandbox has been launched. The popularization and adoption rates
of fintech are being raised by setting up funds, launching fintech publicity
activities, etc.


2.6.3 Fintech Development Measures in Key Cities: Mumbai—Sound 
Financial Foundation and Satisfying Scene Experience 


In 2019, Mumbai's overall ranking in the Global Fintech Hub Index (GFHI) improved
by 6 places, and it entered the top 20 in the world for the first time. Mumbai's
fintech industry has improved rapidly: with 6 highly financed unlisted fintech
enterprises, such as Freecharge and InCred, it ranked 16th in the world on this
count. Moreover, with its huge population and favorable population structure (the
average age was only 27 years old), Mumbai had a fintech user share of 64%, ranking
12th in the world and first among Asian cities outside China, which makes the
fintech user experience a major advantage for Mumbai.

Actively building fintech into one of the characteristic industries for urban
development. Mumbai, as the largest financial center in India, is constantly improving
its fintech ecology. At present, it has large financial institutions such as HDFC Bank,
Kotak Mahindra Bank, ICICI Bank, and State Bank of India, and ranks seventh
in the global TOP200 financial institutions by total market value. The digital
transformation of traditional finance is being actively accelerated; for example,
State Bank of India has launched YONO, a comprehensive life and financial service
platform, and HDFC Bank has launched UltraCash, a mobile payment application.
The utilization rate of fintech has risen at an accelerating pace.




2.7 Israel—Guidance Plus Service to Create a Highland for 
the Development of Fintech 


2.7.1 Development Features: Technology and International Resources 
are Transformed Into Fintech Development Advantages 


Israel is an internationally recognized innovation powerhouse. The proportion of 
scientists and engineers engaged in high-tech research and development in Israel is 
the highest in the world. Among the high-tech companies listed on Nasdaq in the 
United States, the total number of Israeli companies ranks second. Israel has more 
than 6,000 technology start-up companies, ranking first in the world. More than 
270 multinational companies in the world have set up scientific research centers in 
Israel. Israel has strong scientific and technological innovation genes and interna- 
tional resources. These resources have laid a solid foundation for fintech innovation 
and development of Israel. In general, the development of Israeli fintech presents 
three major development features. 

First, the underlying technology shows obvious endowment advantages. Israel is 
a model of the integration of military and civilian development all over the world. At 
the same time, the demand for cutting-edge technology and related innovations in 
the military field are smoothly transmitted to the commercial field. Israel has strong 
military applications in the fields of security, computer vision, and neuro-language 
planning. These technological applications are also applied to the development of 
fintech. Currently, it is the country with the highest usage density of fintech applica- 
tions. Israel is one of the first countries in the world to adopt blockchain and digital 
encryption technology. It has many start-up companies with core technologies in the 
field of blockchain and digital encryption, such as QEDit and DAGLabs. From 2013 
to 2019, the amount of financing for fintech and related underlying technologies 
(artificial intelligence, network security, etc.) was on an upward trend. In the
field of artificial intelligence in particular, the amount of financing more than
doubled from USD 1.463 billion in 2017 to USD 3.182 billion in 2019, and the average
amount of a single financing increased from USD 7.07 million in 2017 to USD 16.41
million in 2019, an increase of more than 100%.

Second, the science and innovation ecology is increasingly mature. Israel attaches
great importance to building its science and innovation ecology. Its government
has taken various measures to ensure that scientific research remains one of its
priorities. It provides support through tax and fee reductions, and at the same time it
increases expenditure in the industry. In the 2020 OECD R&D Intensity Index (the
ratio of R&D investment to GDP), Israel continued to maintain its leading position.
It is expected that total expenditure will continue to increase. According
to the latest annual global entrepreneurial ecosystem rankings released by the global
entrepreneurial research organization StartupBlink, Israel's global ranking has risen
by one place over last year, to third in the world.

Third, the degree of internationalization of fintech is high. Because of geographical
and market restrictions, Israel's fintech sector has attracted international
investment and fintech companies and has exported products and services abroad
since it first emerged. According to a report released by the Israel Venture Capital
Data Center in 2020, the participation of foreign investors in Israeli equity
investment in the fintech sector increased from 57% in 2018 to 69% in 2019. In the fields
of payment, transaction, and digital currency, more than 90% of fintech companies
provide international services. In 2019, Israel's high-tech industry exports continued
to grow, reaching a record USD 45.8 billion, accounting for about 46%
of Israel's total exports, an increase of 1.2% over 2018. Among them, the export
of related technology products and services such as fintech and artificial intelligence
accounted for a relatively large proportion.


2.7.2 Policies and Regulatory Measures: Guidance but Not Leading, 
and Strengthening of Communication Between Government and 
Industries 


Israel has adopted many policies and regulatory measures. From the establishment of a
regulatory and innovative fintech hub to the establishment of a fintech assistance center,
and from adjusting the fintech license application process to launching the data sandbox
program, the Israeli government essentially guides the development of fintech as a
market assistant and industrial development guide. Strengthening communication
between the government and the industries and acting as a partner in the development
of fintech are the characteristics of Israeli fintech policies and regulatory measures.

In July 2018, the Israel Securities Authority (hereinafter referred to as ISA) 
announced the establishment of a regulatory innovation fintech hub, mainly aiming 
at promoting dialogue between regulators and participants in the fintech industry. In 
2019, the Capital Market Authority of Israel joined the GFIN and participated in the 
global fintech regulatory reform together with international institutions such as the 
World Bank and the International Monetary Fund, etc. In July 2020, the Israel Secu- 
rities Authority and the Israel Innovation Authority jointly launched a data sandbox 
plan for fintech start-ups. 


2.7.3 Layout of Key Fintech Cities: Tel Aviv—The Integration of 
Internal and External Strategies to Promote the Innovation and 
Development of Fintech 


Tel Aviv is Israel's second largest city. The city cluster centered on Tel Aviv has
become Israel's largest metropolitan area and economic hub, and is known as Israel's
economic capital and technology center. 77% of Israeli start-ups, 81% of investment
institutions, 72% of incubators, and 85% of R&D centers are located in Tel Aviv.
Tel Aviv hosts Israel's only stock exchange, the Tel Aviv Stock Exchange (TASE), and
has become the international headquarters of venture capital companies and scientific
research institutions, and a gathering place for high-tech companies. At the same
time, Tel Aviv has a relatively complete innovation incubation system and scientific
talents. Among the top 200 global entrepreneurial ecosystem cities, Tel Aviv
ranks seventh. Its advantage in focusing on financial innovation with leading
technology is obvious. According to the 27th Global Financial Center Index Report
(GFCI 27), Tel Aviv ranks 36th.

Tel Aviv's fintech development adopts an international strategy. Taking advantage
of the city's superior innovation environment and well-developed concentration of
international capital, major fintech companies treat Tel Aviv as an
effective test point for product technology, and Tel Aviv is the first place where
the effects of products and services are tested before international marketing. The
Tel Aviv Global City Office is used to implement targeted marketing for international
fintech customers. While the city's global media image is enhanced, various activities
are held to meet fintech service needs, link start-ups with investment capital, and
implement cross-bank and cross-domain cooperation.


2.8 Indonesia—A Rising Star of Fintech Development in 
Southeast Asia 


2.8.1 Development Features: Fintech is in the Preliminary
Development Stage, and Its Potential Continues to be Highlighted


The development of the Internet has certain advantages. According to the 2019
Southeast Asia Digital Economy Report, Indonesia has the largest
Internet economy in Southeast Asia. It had more than quadrupled by 2019, to
more than USD 40 billion, and it is expected to reach USD 130 billion in 2025.
Internet users in Indonesia are growing rapidly. According to a report released by
We Are Social, a global social media marketing company, and Hootsuite, in January
2020 the Internet penetration rate of Indonesia was 64%, with an average annual growth
rate of close to 20%. Moreover, Indonesia has skipped the mature development stage
of the Internet and moved directly to the development stage of the mobile Internet. In
January 2019, there were 356 million mobile phone users in Indonesia, the penetration
rate of mobile phones was 133%, and the number of active mobile Internet users
reached 142 million.

The fintech industry is developing rapidly. According to a market report by Swiss 
Global Enterprise, a Swiss export and promotion agency, Indonesia's digital financial
services revenue is expected to grow significantly at a compound annual growth rate 
(CAGR) of 34%, and will reach USD 8.6 billion in 2025. The research report Future 
of Southeast Asia Financial Technology shows that the total valuation of Indonesian 
fintech companies reached USD 35 billion in 2020, accounting for 32% of that in 
Southeast Asia. 

The online lending and payment industry is booming. The cumulative amount of
online lending in Indonesia increased from IDR 2.56 trillion in December
2017 to IDR 102.52 trillion in March 2020, a roughly 40-fold increase. According to data
compiled by the Otoritas Jasa Keuangan (OJK), Indonesia's total fintech
loans in May 2020 increased by 166.03% year on year. OJK estimates
that there are more than 25 million borrower accounts and more than 654,200 entities
providing loans. In terms of fintech payment, the total number of electronic money
transactions at the end of 2019 reached 5.2 billion, an increase of 79.3% from 2.9
billion in the previous year. By May 2020, Bank Indonesia (BI) had issued licenses to
51 electronic money operators, and the main participants included GoPay, Ovo, Dana,
and LinkAja.


2.8.2 Policies and Regulatory Measures: The System Continues to be 
Improved and the Supervision Continues to be Upgraded 


The fintech policy and supervision system has been continuously improved to
encourage the development of the industry. Indonesia's fintech sector is under the
supervision of Bank Indonesia (BI) and the Otoritas Jasa Keuangan (OJK), with the
Ministry of Information and Communications of Indonesia playing a supporting role.
Bank Indonesia and OJK are responsible for different regulatory fields. Each of them
has a supervisory team, and they learn from and complement each other
(Table 1).

In October 2017, the Otoritas Jasa Keuangan (OJK) issued the 2017-2022 Devel- 
opment Plan, which formulated 10 major policies and implementation plans, and 
clearly stated that appropriate supervision should be carried out to optimize the devel- 
opment of financial technology. On November 30, 2017, Bank Indonesia issued the 
Financial Technology Regulatory Regulation No. 19/12/PBI/2017 for the first time, 
which aimed to regulate fintech behaviors to promote innovation, protect consumers 
and manage risks so as to maintain a stable currency and financial system and build 
an efficient, safe and reliable payment system. In the same year, Bank Indonesia
launched a fintech Regulatory Sandbox, allowing fintech companies in six major
sectors [payment system development (including blockchain and distributed ledgers);
aggregate payment; Internet investment management and risk control; Internet
insurance; credit, financing business, and capital allocation; other financial services
(as judged by Bank Indonesia)] to perform a six-month test on their services under
the supervision of Bank Indonesia. On August 16, 2018, OJK, based on the
experience of Bank Indonesia in the fintech Regulatory Sandbox and pre-audit
mechanism in the payment field, issued the Digital Financial Innovation Regulation
No. 13/POJK.02/2018, which proposed a package of regulations on fintech
supervision, established a Regulatory Sandbox system, and filled the gaps left by
Bank Indonesia Regulation No. 19/12/PBI/2017.


Table 1 Fintech regulatory authorities and their responsibilities

Regulatory authorities | Specific regulatory responsibilities

Bank Indonesia | Electronic wallet, electronic cash, payment gateway, principal, conversion company, card issuer and receiver, clearing office, settlement agent, virtual currency, blockchain, national payment gateway, payment transaction support

Otoritas Jasa Keuangan (OJK) | P2P, crowdfunding, digital banking, insurtech, capital market fintech, venture capital, online financing, data security, consumer protection

Ministry of Information and Communications | Telecommunications, information technology, fintech related to information technology

Strengthening international fintech cooperation. In October 2018, at the annual
meeting of the International Monetary Fund and the World Bank, the Indonesian
government promoted the adoption of the FinTech Agenda. In the same year, China's Alibaba
Cloud announced the establishment of its first data center in Indonesia and officially
put it into operation. Since then, various Chinese fintech companies and investors
have entered the Indonesian market. In September 2020, the Securities Commission
of Malaysia (SC) signed a fintech cooperation agreement with Indonesia's Otoritas
Jasa Keuangan (OJK) in order to establish a cooperation framework to develop
fintech ecosystems in the two markets.


2.8.3 Layout of Key Fintech Cities: Jakarta—The Rapid Development
of Fintech with Inclusive Finance as the Core


Jakarta is the capital, the largest city, and the economic center of Indonesia. The
Greater Jakarta Region, which includes the surrounding towns, is the second largest
metropolitan area in the world. Its industry is dominated by finance, accounting for
about one-third of the country's GDP. It has the largest financial and major industrial
and commercial institutions in the country. Stock exchanges and futures exchanges
are all located in Jakarta. At the same time, Jakarta is also Indonesia's fintech hub
city. At present, most fintech companies are located in Jakarta (or the Greater Jakarta
Region), and their domestic business customers are largely in the same area.

Jakarta has become the birthplace of fintech companies, the first test site, and the
first launch site for products and services. Indonesia's first fintech unicorn company
OVO was born in Jakarta, and Akselan, the first equity crowdfunding platform, was
officially established in Jakarta. Among the numerous fintech companies, more than
70% are engaged in digital financial inclusion business, mainly providing financing
and lending services for small and micro enterprises and rural populations.


2.9 Hong Kong of China—The Government Assists the
Strong Development of Fintech


2.9.1 Development Features: Seek Innovation While Maintaining 
Stability to Build a Fintech Hub 


The enthusiasm for the development of fintech is booming. Hong Kong has a huge
financial system, a relatively complete financial ecology, and favorable conditions for
the development of fintech. The construction and development of the
Guangdong-Hong Kong-Macao Greater Bay Area also brings more opportunities for the
development of fintech in Hong Kong. According to statistics from the Hong Kong
Investment Promotion Agency, there are currently more than 600 fintech companies in
Hong Kong covering multiple business areas. According to KPMG statistics on global
investment and financing, Hong Kong, China ranked ninth in
Asia in terms of investment and financing in 2018. Between 2014 and 2018, the
total investment of fintech companies operating in Hong Kong amounted to USD 1.1
billion. In the first half of 2019, fintech companies in Hong Kong raised a total of
USD 150 million, a year-on-year increase of 561%. At present,
Hong Kong's key application areas of fintech include mobile payment, cross-border
e-commerce payment, securities payment settlement, online financing platforms, wealth
technology, commercial insurance, etc.

The internationalization characteristics in various fields of fintech are obvious. In
terms of payment and settlement, with a wide variety of financial products in Hong
Kong, coupled with the opening of Shanghai-Hong Kong Stock Connect,
Shenzhen-Hong Kong Stock Connect, and Bond Connect, post-trade processing platforms
offer huge space for fintech. In terms of wealth management, as an international
investment and asset management center, Hong Kong has already applied a large number of
technologies in the field of asset management, such as computerized transactions
and investments, and still has great potential in automated consulting, big data, and
artificial intelligence. Cross-border e-commerce in the trade field involves
payment and exchange in multiple currencies. Hong Kong has the obvious advantages
of free currency convertibility and its status as an offshore RMB center. Coupled with
tax laws and other conditions, Hong Kong remains a preferred location for cross-border
e-commerce. Relying on the above advantages, a large number of cross-border payment
companies have recently appeared in Hong Kong. In terms of supply chain finance, the
blockchain trade financing platform is the construction focus in Hong Kong.

The fintech talent training system is complete. Currently, fintech talents in Hong
Kong are cultivated mainly by two types of bodies, namely colleges
and universities and social organizations. In terms of colleges and universities, many
have set up fintech majors at the undergraduate, master's, and
doctoral levels. The Chinese University of Hong Kong, the University of Hong Kong,
City University of Hong Kong, and the Open University of Hong Kong offer a fintech
major at the undergraduate level; the Chinese University of Hong Kong, Hong Kong
University of Science and Technology, Hong Kong Baptist University, etc. offer such
a major at the master's level; and the Hong Kong Polytechnic University offers a fintech
major. In terms of social organizations, organizations such as the Hong Kong Youth
Association Continuing Education Center, the Vocational Training Council, the
Institute of Financial Technologists of Asia, and the Hong Kong Institute of Bankers have
attracted fintech talents or strengthened their training through online and offline
methods by setting up basic fintech courses and issuing certificates.


2.9.2 Fintech Development Policies and Measures: Innovative Support
Measures Contribute to Strong Development of Fintech


The government supervision service system is efficient and well developed. Hong Kong
has successively established and improved relevant government service systems. First,
specialized agencies and Regulatory Sandboxes have been established. The
government of the Hong Kong Special Administrative Region has established an
Innovation and Technology Bureau to coordinate the development of fintech. At the same
time, the Hong Kong Monetary Authority, the Securities and Futures Commission,
and the Insurance Regulatory Authority have each set up fintech Regulatory
Sandboxes to provide companies with a pilot-based regulatory environment for the
application of innovative technologies. The government has also set
up a fast track for Internet insurance sales companies, such as ZhongAn Insurance and
other online insurance companies, to apply for licenses. Second, the introduction of
corporate resources has been strengthened. The Hong Kong Investment Promotion Agency
has established a fintech task team, which has attracted 19 fintech companies to
settle in Hong Kong and provided assistance to more than 310 fintech companies.
Third, Hong Kong is integrating into the development of the Guangdong-Hong Kong-Macao
Greater Bay Area. Hong Kong cooperates with companies such as Tencent, and the Hong
Kong Monetary Authority has issued the first batch of third-party payment licenses to
them. Through the deployment of WeChat Hong Kong Wallet, Passenger QR Code,
We Remit and other products, Tencent has combined the mobile payment capabilities
it has accumulated over many years with the advantages of Hong Kong and Macau.
With the support of all regulatory parties, Tencent's E-Pass will give
priority to piloting virtual multi-certificate integration in the Guangdong-Hong
Kong-Macao Greater Bay Area, which can meet the needs of residents in Guangdong, Hong
Kong, and Macau to use a unified digital identity to enjoy multiple services so as to
realize interconnection within the Guangdong-Hong Kong-Macao Greater Bay Area.


A Probability Inequality with
Application to Lattice Theory


Tian Kun 


Abstract Here we mainly provide a probability inequality about the GGH public-key
encryption scheme. Given a constant $\sigma$, we first choose a lattice vector
$v \in \mathbb{Z}^n$, and a small error vector $e$ is generated satisfying
$|e| \le \sigma$. The ciphertext $c$ is computed by the function
$f_{B,\sigma}(v, e) = Bv + e$ with a public basis $B$. To extract the message $v$,
the function $f^{-1}_{B,\sigma}(c) = B^{-1}\lfloor c \rceil_R$ is used, based on the
private basis $R$. In this work we produce a bound for the error probability of
$v \ne B^{-1}\lfloor c \rceil_R$. We also illustrate how to choose $\sigma$ such that
the error probability is arbitrarily small.


Keywords Probability inequality · Encryption scheme · Lattice


1 Introduction 


Given a full-rank lattice $L \subset \mathbb{Z}^n$, we denote the public basis of $L$
by $B$ and the private basis of $L$ by $R$. Both $B$ and $R$ are $n \times n$
invertible matrices. In the GGH public-key encryption scheme, for a plaintext vector
$v \in \mathbb{Z}^n$, the random error vector $e$ is chosen so that the absolute value
of each entry is no more than a constant $\sigma$, where $\sigma$ is a positive real
number. The ciphertext $c$ is computed by
$c = f_{B,\sigma}(v, e) = Bv + e \in \mathbb{R}^n$.
Using the results of Babai and others (Ajtai, 1996; Ajtai & Dwork, 1997;
Babai, 1986; Coppersmith & Shamir, 1997; Goldreich et al., 1997; Micciancio, 2001;
Hoffstein et al., 2017, 1998), we can decipher the plaintext
$v = B^{-1}\lfloor c \rceil_R$ given $B$, $R$ and the ciphertext $c$. Here the lattice
point $\lfloor c \rceil_R$ is obtained by representing $c$ as a linear combination of
the columns of $R$ and rounding the coefficients in this linear combination to the
nearest integers, i.e., $\lfloor c \rceil_R = R\lfloor R^{-1}c \rceil$. The question
is how $\sigma$ should be chosen so that we recover the right plaintext $v$, or at
least guarantee a low error probability. We give three theorems to address this
question, including a probability inequality that bounds the inversion error
probability.


2 Main Results


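Theorem 1 Let $B$ be the public basis and $R$ the private basis of the lattice $L$.
Let $v \in \mathbb{Z}^n$, let $e$ be the random error vector with
$|e|_\infty \le \sigma$, and let $c = f_{B,\sigma}(v, e) = Bv + e$. Then
$B^{-1}\lfloor c \rceil_R = v$ if and only if $\lfloor R^{-1}e \rceil = 0$, where
$\lfloor R^{-1}e \rceil$ denotes the vector in $\mathbb{Z}^n$ obtained by rounding each
entry of $R^{-1}e$ to the nearest integer.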


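Proof Let $T = B^{-1}R$, then
$$B^{-1}\lfloor c \rceil_R = B^{-1}\lfloor Bv + e \rceil_R = B^{-1}R\lfloor R^{-1}(Bv + e) \rceil = T\lfloor T^{-1}v + R^{-1}e \rceil.$$

Since $T = B^{-1}R$ is a unimodular matrix, $T^{-1}$ is also a unimodular matrix.
As $v \in \mathbb{Z}^n$, we have $T^{-1}v \in \mathbb{Z}^n$, so
$$B^{-1}\lfloor c \rceil_R = T\lfloor T^{-1}v + R^{-1}e \rceil = v + T\lfloor R^{-1}e \rceil.$$

Thus $B^{-1}\lfloor c \rceil_R = v$ is equivalent to $T\lfloor R^{-1}e \rceil = 0$,
and this equality holds if and only if $\lfloor R^{-1}e \rceil = 0$.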


Remark 1 This theorem gives an equivalent condition to check whether the decryp- 
tion result is accurate. 


Theorem 2 Let $R$ be the private basis of the lattice $L$, and let $e$ be the random
error vector such that $|e|_\infty \le \sigma$. Suppose the maximum $L_1$ norm of the
rows in $R^{-1}$ is $\rho$. Then if
$$\sigma < \frac{1}{2\rho},$$
$\lfloor R^{-1}e \rceil = 0$ holds.


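Proof Let $R^{-1} = (c_{ij})_{n \times n}$ and $R^{-1}e = (a_1, a_2, \ldots, a_n)^\top$.
For any $1 \le i \le n$,
$$|a_i| = \Big|\sum_{j=1}^{n} c_{ij}e_j\Big| \le \sum_{j=1}^{n} |c_{ij}|\,|e_j| \le \sigma\rho < \frac{1}{2}.$$

This means that $\lfloor R^{-1}e \rceil = 0$.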


Remark 2 Theorem 2 shows how $\sigma$ can be chosen so that no inversion error occurs.
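
The threshold of Theorem 2 can be checked on a small example. The sketch below assumes a hypothetical 2 x 2 inverse private basis `R_INV` (chosen for illustration only), computes $\rho$ and the threshold $1/(2\rho)$, and then enumerates all integer error vectors with entries bounded by $\sigma$. Only integer errors are enumerated here, whereas the theorem covers all real errors with $|e|_\infty \le \sigma$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical inverse of a 2 x 2 private basis R = [[7, 1], [-1, 6]] (det = 43).
R_INV = [[Fraction(6, 43), Fraction(-1, 43)],
         [Fraction(1, 43), Fraction(7, 43)]]

def max_row_l1(M):
    """rho: the maximum L1 norm over the rows of M."""
    return max(sum(abs(x) for x in row) for row in M)

def sigma_threshold(R_inv):
    """Theorem 2: any sigma strictly below 1 / (2 rho) gives no inversion error."""
    return 1 / (2 * max_row_l1(R_inv))

def rounds_to_zero(R_inv, e):
    """True if every entry of R^{-1} e rounds to 0."""
    return all(round(sum(c * x for c, x in zip(row, e))) == 0 for row in R_inv)

def all_integer_errors_safe(R_inv, sigma):
    """Check every integer error vector e with |e_i| <= sigma."""
    n = len(R_inv)
    grid = range(-sigma, sigma + 1)
    return all(rounds_to_zero(R_inv, e) for e in product(grid, repeat=n))

threshold = sigma_threshold(R_INV)            # 43/16 for this example
safe_below = all_integer_errors_safe(R_INV, 2)   # sigma = 2 is below the threshold
safe_above = all_integer_errors_safe(R_INV, 4)   # sigma = 4 is above the threshold
```

Note that the condition of Theorem 2 is sufficient but not necessary: above the threshold some individual error vectors may still decrypt correctly, but not all of them are guaranteed to.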


Theorem 3 Let an $n \times n$ matrix $R$ be the private basis used in the inversion of
$f_{B,\sigma}$, and denote the maximum $L_\infty$ norm of the rows in $R^{-1}$ by
$\gamma/\sqrt{n}$. Then the probability of inversion errors is bounded by
$$P\{\lfloor R^{-1}e \rceil \ne 0\} \le 2n \cdot \exp\Big(-\frac{1}{8\sigma^2\gamma^2}\Big),$$
where $e = (e_1, e_2, \ldots, e_n)^\top$ and $e_1, e_2, \ldots, e_n$ are $n$ independent
random variables such that $|e_i| \le \sigma$ and $E(e_i) = 0$ for $1 \le i \le n$.


Lemma 1 For any non-negative random variable $X$ with finite expectation $E(X)$
and any positive real number $\mu$, we have
$$P\{X \ge \mu\} \le \frac{E(X)}{\mu}.$$


Proof Here we treat $X$ as a random variable of continuous type. For the other
situations, the proof is similar. Let $f(x)$ be the probability density function of
$X$. Since
$$E(X) = \int_0^{\infty} x f(x)\,dx \ge \int_{\mu}^{\infty} x f(x)\,dx \ge \int_{\mu}^{\infty} \mu f(x)\,dx = \mu P\{X \ge \mu\},$$
we have $P\{X \ge \mu\} \le E(X)/\mu$.


Lemma 2 Let $X$ be a random variable satisfying $-a \le X \le a$ with $E(X) = 0$,
where $a > 0$. For any real number $\lambda$, we have
$$E(e^{\lambda X}) \le \exp\Big(\frac{\lambda^2 a^2}{2}\Big).$$

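Proof For any real number $\lambda$, $f(x) = e^{\lambda x}$ is a convex function.
Notice that
$$x = \frac{x+a}{2a}\cdot a + \frac{a-x}{2a}\cdot(-a), \qquad -a \le x \le a,$$
then
$$f(x) \le \frac{x+a}{2a} f(a) + \frac{a-x}{2a} f(-a),$$
that is,
$$e^{\lambda x} \le \frac{x+a}{2a} e^{\lambda a} + \frac{a-x}{2a} e^{-\lambda a}.$$
Taking expectations and using $E(X) = 0$, we get
$$E(e^{\lambda X}) \le E\Big(\frac{X+a}{2a} e^{\lambda a} + \frac{a-X}{2a} e^{-\lambda a}\Big) = \frac{e^{\lambda a} + e^{-\lambda a}}{2}.$$

Let $t = \lambda a$; next we prove that $\frac{1}{2}(e^{t} + e^{-t}) \le \exp(t^2/2)$.
This inequality is equivalent to
$$\ln\frac{e^{t} + e^{-t}}{2} \le \frac{t^2}{2}.$$
Let $g(t) = \frac{t^2}{2} - \ln\frac{e^{t} + e^{-t}}{2}$, then
$g'(t) = t - \frac{e^{t} - e^{-t}}{e^{t} + e^{-t}}$ and $g'(0) = 0$. Since
$g''(t) \ge 0$, we get $g'(t) \le 0$ if $t \le 0$ and $g'(t) \ge 0$ if $t \ge 0$.
Then $g(t) \ge g(0) = 0$ and we complete the proof.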
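
The last step of the proof can be sanity-checked numerically. The sketch below evaluates the gap $\exp(t^2/2) - \frac{1}{2}(e^{t} + e^{-t})$ on a grid of values of $t$; the grid and the function name are arbitrary choices for illustration, and a numerical check of this kind does not replace the proof. For a variable taking the values $\pm a$ with probability $1/2$ each, $E(e^{\lambda X})$ equals $\frac{1}{2}(e^{\lambda a} + e^{-\lambda a})$, so the convexity step in the proof holds with equality for that distribution.

```python
from math import cosh, exp

def lemma2_gap(t):
    """exp(t^2 / 2) - (e^t + e^{-t}) / 2, shown in the proof to be non-negative."""
    return exp(t * t / 2) - cosh(t)

# Arbitrary grid of test points t = -5.0, -4.9, ..., 5.0.
grid = [k / 10 for k in range(-50, 51)]
gaps = [lemma2_gap(t) for t in grid]
all_nonnegative = all(g >= 0 for g in gaps)
```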


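Lemma 3 Suppose $X_1, X_2, \ldots, X_n$ are $n$ independent random variables. For
$1 \le i \le n$, we have $-a \le X_i \le a$ and $E(X_i) = 0$, where $a > 0$. Let
$S_n = \sum_{i=1}^{n} X_i$ and $\varepsilon > 0$, then
$$P\{|S_n| \ge \varepsilon\} \le 2\exp\Big(-\frac{\varepsilon^2}{2na^2}\Big).$$


Proof For any $\lambda > 0$, based on Lemma 1, we can get
$$P\{S_n \ge \varepsilon\} = P\{e^{\lambda S_n} \ge e^{\lambda\varepsilon}\} \le \frac{E(e^{\lambda S_n})}{e^{\lambda\varepsilon}}.$$
Since $X_1, X_2, \ldots, X_n$ are independent random variables, combining with Lemma 2,
$$E(e^{\lambda S_n}) = \prod_{i=1}^{n} E(e^{\lambda X_i}) \le \prod_{i=1}^{n} \exp\Big(\frac{\lambda^2 a^2}{2}\Big) = \exp\Big(\frac{n\lambda^2 a^2}{2}\Big),$$
so
$$P\{S_n \ge \varepsilon\} \le \frac{E(e^{\lambda S_n})}{e^{\lambda\varepsilon}} \le \exp\Big(\frac{n\lambda^2 a^2}{2} - \lambda\varepsilon\Big).$$

Let $\lambda = \frac{\varepsilon}{na^2}$; the above inequality then becomes
$$P\{S_n \ge \varepsilon\} \le \exp\Big(-\frac{\varepsilon^2}{2na^2}\Big).$$

In the same way, we can prove that
$$P\{S_n \le -\varepsilon\} \le \exp\Big(-\frac{\varepsilon^2}{2na^2}\Big).$$

Thus
$$P\{|S_n| \ge \varepsilon\} \le 2\exp\Big(-\frac{\varepsilon^2}{2na^2}\Big).$$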
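
To see how the bound of Lemma 3 compares with an exact tail probability, the sketch below considers one particular distribution satisfying the hypotheses: each $X_i$ takes the values $+a$ and $-a$ with probability $1/2$. This choice of distribution, the parameter values, and the function names are assumptions made for illustration. In this case $S_n = (2K - n)a$ with $K$ binomially distributed, so the tail can be computed exactly.

```python
from math import comb, exp

def lemma3_bound(n, a, eps):
    """Right-hand side of Lemma 3: 2 exp(-eps^2 / (2 n a^2))."""
    return 2 * exp(-eps ** 2 / (2 * n * a ** 2))

def exact_tail_two_point(n, a, eps):
    """Exact P{|S_n| >= eps} when each X_i is +a or -a with probability 1/2.

    If K of the n variables equal +a, then S_n = (2K - n) a.
    """
    hits = sum(comb(n, k) for k in range(n + 1) if abs(2 * k - n) * a >= eps)
    return hits / 2 ** n

n, a = 10, 1
comparison = [(eps, exact_tail_two_point(n, a, eps), lemma3_bound(n, a, eps))
              for eps in range(1, n + 1)]
bound_holds = all(exact <= bound for _, exact, bound in comparison)
```

For this distribution the exact tail is always below the bound, as the lemma requires, although the bound is not tight.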


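Proof of Theorem 3. Now we can prove Theorem 3, stated above, using Lemma 3.

Let $R^{-1} = (c_{ij})_{n \times n}$ and $e = (e_1, e_2, \ldots, e_n)^\top$, where
$e_1, e_2, \ldots, e_n$ are $n$ independent random variables such that
$|e_i| \le \sigma$ and $E(e_i) = 0$ for $1 \le i \le n$.

We denote $R^{-1}e = (a_1, a_2, \ldots, a_n)^\top$, i.e.,
$a_i = \sum_{j=1}^{n} c_{ij}e_j$, $1 \le i \le n$.

Since $|c_{ij}| \le \gamma/\sqrt{n}$ and $|e_j| \le \sigma$, the random variable
$c_{ij}e_j$ is limited to the interval
$[-\sigma\gamma/\sqrt{n}, \sigma\gamma/\sqrt{n}]$. Based on Lemma 3 with
$a = \sigma\gamma/\sqrt{n}$ and $\varepsilon = 1/2$,
$$P\Big\{|a_i| \ge \frac{1}{2}\Big\} = P\Big\{\Big|\sum_{j=1}^{n} c_{ij}e_j\Big| \ge \frac{1}{2}\Big\} \le 2\exp\Big(-\frac{1}{8\sigma^2\gamma^2}\Big),$$
and therefore
$$P\{\lfloor R^{-1}e \rceil \ne 0\} \le P\Big\{\bigcup_{i=1}^{n}\Big\{|a_i| \ge \frac{1}{2}\Big\}\Big\} \le \sum_{i=1}^{n} P\Big\{|a_i| \ge \frac{1}{2}\Big\} \le 2n\exp\Big(-\frac{1}{8\sigma^2\gamma^2}\Big).$$

Thus the inequality in Theorem 3 holds.


Corollary 1 $P\{\lfloor R^{-1}e \rceil \ne 0\} \le \varepsilon$ if
$\sigma \le \big(\gamma\sqrt{8\ln(2n/\varepsilon)}\big)^{-1}$.

Proof Since
$$\sigma \le \big(\gamma\sqrt{8\ln(2n/\varepsilon)}\big)^{-1} \iff 2n \cdot \exp\Big(-\frac{1}{8\sigma^2\gamma^2}\Big) \le \varepsilon,$$
it follows from Theorem 3 that
$$P\{\lfloor R^{-1}e \rceil \ne 0\} \le 2n \cdot \exp\Big(-\frac{1}{8\sigma^2\gamma^2}\Big) \le \varepsilon.$$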


Remark 3 Theorem 3 provides a way to estimate the bound of the inversion error
probability, and Corollary 1 gives an explicit bound for $\sigma$, based on Theorem 3,
under which the error probability is no more than a constant $\varepsilon$.
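
The bound of Theorem 3 and the choice of $\sigma$ in Corollary 1 are simple to evaluate. The sketch below is illustrative only: the function names and the parameter values ($n = 200$, $\gamma = 1$, $\varepsilon = 10^{-5}$) are assumptions, not values from the paper. The helper `gamma_of` derives $\gamma$ from $R^{-1}$ using the convention of Theorem 3, in which the maximum $L_\infty$ norm of the rows of $R^{-1}$ equals $\gamma/\sqrt{n}$.

```python
from math import exp, log, sqrt

def gamma_of(R_inv):
    """gamma = sqrt(n) * (largest absolute entry of R^{-1})."""
    n = len(R_inv)
    return sqrt(n) * max(abs(x) for row in R_inv for x in row)

def inversion_error_bound(n, sigma, gamma):
    """Theorem 3: 2n * exp(-1 / (8 sigma^2 gamma^2))."""
    return 2 * n * exp(-1 / (8 * sigma ** 2 * gamma ** 2))

def max_sigma(n, gamma, eps):
    """Corollary 1: largest sigma for which the bound is at most eps (eps < 2n)."""
    return 1 / (gamma * sqrt(8 * log(2 * n / eps)))

# Hypothetical parameters for illustration.
n, gamma, eps = 200, 1.0, 1e-5
sigma_star = max_sigma(n, gamma, eps)
bound_at_sigma_star = inversion_error_bound(n, sigma_star, gamma)   # equals eps up to rounding
bound_at_half = inversion_error_bound(n, sigma_star / 2, gamma)     # much smaller than eps
```

At `sigma_star` the bound meets $\varepsilon$ exactly (up to floating-point rounding), and it decreases rapidly as $\sigma$ shrinks, since $\sigma$ enters through $\exp(-1/(8\sigma^2\gamma^2))$.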


3 Conclusions 


In this work we mainly present a probability inequality about the GGH public-key
encryption scheme. In this scheme, we first take a lattice vector
$v \in \mathbb{Z}^n$ and generate a small error vector $e$ such that
$|e| \le \sigma$. Given a public basis $B$, the function
$f_{B,\sigma}(v, e) = Bv + e$ computes the ciphertext $c$. To decrypt, the private basis
$R$ and the function $f^{-1}_{B,\sigma}(c) = B^{-1}\lfloor c \rceil_R$ are used to
extract the message $v$. We give a bound for the error probability of
$v \ne B^{-1}\lfloor c \rceil_R$ and explain how to choose $\sigma$
in order to obtain an error probability no more than a given constant $\varepsilon$.


Abstract For complex diseases, beyond the main effects of genetic (G) and
environmental (E) factors, gene-environment (G-E) interactions also play an important
role. Many of the existing G-E interaction methods conduct marginal analysis, which 
may not appropriately describe disease biology. Joint analysis methods have been 
developed, with most of the existing loss functions constructed based on likelihood. 
In practice, data contamination is not uncommon. Development of robust methods 
for interaction analysis that can accommodate data contamination is very limited. 
In this study, we consider censored survival data and adopt an accelerated failure 
time (AFT) model. An exponential squared loss is adopted to achieve robustness. 
A sparse group penalization approach, which respects the “main effects, interac- 
tions" hierarchy, is adopted for estimation and identification. Consistency properties 
are rigorously established. Simulation shows that the proposed method outperforms 
direct competitors. In data analysis, the proposed method makes biologically sensible
findings.


1 Introduction


For many complex diseases, it is essential to identify important risk factors that are 
associated with prognosis. In the omics era, profiling studies have been extensively 
conducted. It has been found that, beyond the main effects of genetic (G) and
environmental (E) risk factors, gene-environment (G-E) interactions can also have
important implications.

Denote $T$ and $C$ as the prognosis and censoring times, respectively. Denote
$X = (X_1, \ldots, X_q)^\top$ as the $q$ environmental/clinical variables, and
$Z = (Z_1, \ldots, Z_p)^\top$ as the $p$ genetic variables. The existing G-E
interaction analysis methods mainly
belong to two families. The first family conducts marginal analysis (Hunter, 2005;
Shi et al., 2014; Thomas, 2010), under which one or a small number of genes
are analyzed at a time. Despite its significant computational simplicity, marginal
analysis contradicts the fact that the prognosis of complex diseases is attributable
to the joint effects of multiple main effects and interactions. The second
family of methods, which is biologically more sensible, conducts joint analysis (Liu
et al., 2013; Wu et al., 2014; Zhu et al., 2014). Among the existing joint
analyses, the regression-based approach is the most popular and proceeds as follows.
Consider the model
$$T \sim \phi\Big(\alpha_0 + \sum_{j=1}^{q} X_j\alpha_j + \sum_{k=1}^{p} Z_k\beta_k + \sum_{j=1}^{q}\sum_{k=1}^{p} X_j Z_k\gamma_{jk}\Big),$$
where the model $\phi(\cdot)$ is known up to the regression coefficients $\alpha_0$,
$(\alpha_j)_{j=1}^{q}$, $(\beta_k)_{k=1}^{p}$, and $(\gamma_{jk})$, $1 \le j \le q$,
$1 \le k \le p$. Conclusions on the importance of interactions are drawn based on
$(\gamma_{jk})$. With the
high data dimensionality and demand for the selection of relevant effects, regularized
estimation is usually needed.
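
The argument of $\phi(\cdot)$ in the joint model above is a linear predictor with $q$ E main effects, $p$ G main effects, and $q \times p$ interaction terms. The sketch below only evaluates this linear predictor for one subject; the function name and the numerical values are hypothetical, and neither the form of $\phi$ nor any estimation or penalization procedure is implemented.

```python
def linear_predictor(x, z, alpha0, alpha, beta, gamma):
    """alpha0 + sum_j x_j alpha_j + sum_k z_k beta_k + sum_{j,k} x_j z_k gamma_{jk}.

    x: the q environmental/clinical values; z: the p genetic values;
    gamma: q x p nested list of interaction coefficients.
    """
    main_e = sum(a * xj for a, xj in zip(alpha, x))
    main_g = sum(b * zk for b, zk in zip(beta, z))
    interactions = sum(gamma[j][k] * x[j] * z[k]
                       for j in range(len(x)) for k in range(len(z)))
    return alpha0 + main_e + main_g + interactions

# Hypothetical subject with q = 2 E variables and p = 3 G variables.
x = [1.0, 2.0]
z = [3.0, 0.0, 1.0]
alpha0, alpha, beta = 0.5, [1.0, 0.0], [0.0, 2.0, -1.0]
gamma = [[0.0, 0.0, 1.0],
         [0.5, 0.0, 0.0]]          # sparse: only two non-zero interactions
eta = linear_predictor(x, z, alpha0, alpha, beta, gamma)

# Number of unknown coefficients grows as 1 + q + p + q * p.
n_coefficients = 1 + len(x) + len(z) + len(x) * len(z)
```

Because the coefficient count grows with $q \times p$, even moderate numbers of genes lead to a high-dimensional problem, which is why regularized estimation is needed.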

In the dominating majority of the existing studies, estimation is based on the standard
likelihood, which is nonrobust. In practice, data contamination is not uncommon
and can be caused by multiple reasons. Many diseases are heterogeneous, and different
subtypes behave differently. When the subtype information is accurately available,
subtype-specific analysis can be conducted. However, when such information is
unavailable or only partially available, which is often the case in practice (He et al., 2015), subjects
belonging to small subtypes may be viewed as "contamination" to those of the leading
subtype. Human errors can also happen. It has been well noted that survival infor- 
mation extracted from medical records is not always reliable (Bowman, 2015; Fall 
et al., 2008), creating contamination in prognosis distributions. In low-dimensional 
biomedical studies, it has been well established that even a single contaminated 
observation can lead to biased model estimation and so false marker identification 
(Huber & Ronchetti, 2009). Our literature review suggests that in the analysis of 
G-E interactions, robust methods that can effectively accommodate contamination 
in prognosis outcomes have been very rare. For marginal interaction analysis, a few 
robust methods, for example, the multifactor dimensionality reduction (MDR), have 
been developed. However, they are not directly applicable to joint analysis because 
of both methodological and computational challenges. As discussed in Wu and Ma
(2015), a handful of robustness studies have been conducted under high-dimensional
settings for joint analysis. However, they are mostly on main effects and not directly 
applicable to interaction analysis because of the additional complexity caused by the 
"main effects, interactions" hierarchy. Most of them adopt the quantile regression
technique. Studies under low-dimensional settings suggest that no robust technique 
can dominate. It is thus desirable to examine alternative robust techniques under 
high-dimensional settings. In addition, for quite a few existing methods, statistical 
properties have not been well studied, casting doubts on their validity. 

Consider data with a prognosis outcome and both G and E measurements. Our 
goal is to conduct joint analysis and identify important G-E interactions and main 
G and E effects. This study advances from the literature in multiple aspects. Specif- 
ically, we consider the scenario with possible contamination in the prognosis out- 
come, which is commonly encountered but little addressed. We adopt an exponential 
squared loss to achieve robustness. This loss function provides a useful alternative 
to the popular quantile regression and other robust approaches but has not been well 
investigated under high-dimensional settings, especially not for interaction analysis. 
This study also marks a novel extension of the exponential squared loss to accom- 
modate censored survival data. For regularized estimation and selection of relevant 
effects, we propose adopting a penalization technique, which respects the “main 
effects, interactions" hierarchy. Significantly advancing from most of the existing 
studies, consistency properties are rigorously established. Theoretical research for 
high-dimensional robust methods remains limited. As such, this study may provide 
valuable insights. With both methodological and theoretical developments, this study 
is warranted beyond the existing literature. 


2 Methods 


2.1 Data and Model Settings 


For describing prognosis, we adopt the AFT model, which has been the choice of 
multiple studies with high-dimensional genetic data (Liu et al., 2013; Shi et al., 2014).
Compared to alternatives including the Cox model, advantages of the AFT model 
include intuitive interpretations and low computational cost, which are especially 
desirable with high-dimensional genetic data. With a slight abuse of notation, still
use $T$ and $C$ to denote the logarithms of the event and censoring times, and let
$\delta = I(T \le C)$. The AFT model specifies that

$$T = \alpha_0 + \sum_{j=1}^{q} X_j \alpha_j + \sum_{k=1}^{p} Z_k \beta_k + \sum_{j=1}^{q}\sum_{k=1}^{p} X_j Z_k \gamma_{jk} + \varepsilon,$$

where $\varepsilon$ is the random error. Following Stute (1993, 1996), we assume that $T$ and
$C$ are independent, and $\delta$ is conditionally independent of $(X^\top, Z^\top)^\top$ given $T$. Let
$W_k = (Z_k, X_1 Z_k, \ldots, X_q Z_k)^\top$ and $b_k = (\beta_k, \gamma_{1k}, \ldots, \gamma_{qk})^\top$, which represent all
main and interaction effects corresponding to the $k$th genetic variable.
With $n$ independent subjects, use subscript "$i$" to denote the $i$th subject. For
subject $i$, let $y_i = \min\{T_i, C_i\}$ and $\delta_i = I(T_i \le C_i)$ be the observed time and event
indicator, respectively. Then the $i$th observation consists of $(y_i, \delta_i, x_i, z_i)$, with
$x_i = (x_{i1}, \ldots, x_{iq})^\top$, $z_i = (z_{i1}, \ldots, z_{ip})^\top$, and $w_{ik} = (z_{ik}, x_{i1} z_{ik}, \ldots, x_{iq} z_{ik})^\top$ denoting
the $i$th realization of $X$, $Z$, and $W_k$, respectively. Denote $u_i = (1, x_i^\top, w_{i1}^\top, \ldots, w_{ip}^\top)^\top$,
$U = (u_1, \ldots, u_n)^\top$, and $\zeta = (\alpha_0, \ldots, \alpha_q, b_1^\top, \ldots, b_p^\top)^\top$. Without loss of
generality, assume that $(y_i, \delta_i, u_i)$'s have been sorted according to $y_i$'s in an ascending
manner.


2.2 Robust Estimation and Identification 


Consider the scenario where the distribution of $\varepsilon$ is not specified, which significantly
differs from the existing parametric studies and makes the proposed method more
flexible. To motivate the proposed estimation, first consider data without contamination.
Stute (1993) developed a weighted least squares estimation approach. Under
low-dimensional settings, Stute's estimator is defined as the minimizer of the loss
function

$$\sum_{i=1}^{n} \omega_i (y_i - u_i^\top \zeta)^2.$$

Here the weights $\omega = (\omega_i)_{i=1}^{n}$ are computed based on the Kaplan-Meier estimation
and defined as

$$\omega_1 = \frac{\delta_1}{n}, \qquad \omega_i = \frac{\delta_i}{n-i+1} \prod_{j=1}^{i-1} \left(\frac{n-j}{n-j+1}\right)^{\delta_j}, \quad i = 2, \ldots, n.$$

It is noted that Stute's estimator is not necessarily the most efficient. However, under
high-dimensional settings, it can be computationally the most convenient with the
least squares loss. It can be seen that, if $\omega_i \neq 0$, one contaminated $y_i$ can lead to
severely biased model estimation.
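The Kaplan-Meier weights can be computed in a single pass over the sorted observations. The following is a minimal sketch in plain Python, assuming the event indicators are already ordered by ascending observed time; the function name `stute_weights` is hypothetical and is not part of the original study.

```python
def stute_weights(delta):
    """Kaplan-Meier (Stute) weights for observations sorted by ascending y.

    delta[i] is 1 for an observed event and 0 for a censored time.
    The function name and interface are illustrative assumptions.
    """
    n = len(delta)
    weights = []
    # Running product over earlier subjects j of ((n - j) / (n - j + 1)) ** delta_j
    running = 1.0
    for i, d in enumerate(delta, start=1):
        weights.append(d / (n - i + 1) * running)
        running *= ((n - i) / (n - i + 1)) ** d
    return weights


# Without censoring, every subject receives weight 1/n.
uncensored = stute_weights([1, 1, 1, 1])
# With censoring, the mass of a censored subject moves to later events.
censored = stute_weights([1, 0, 1])
```

In the second call the censored subject has weight zero and the last event carries weight 2/3, which illustrates how censored observations are redistributed rather than discarded.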

Now consider the scenario with possible outliers in the prognosis data. We propose 
the objective function 


$$Q_\theta(\zeta) = \sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_i^\top \zeta)^2/\theta\right). \qquad (1)$$


This function has been motivated by the following considerations. Under low-dimensional
regression analysis without censoring, Wang et al. (2013) adopted an
exponential squared loss to achieve robustness. The intuition is as follows. For a
contaminated subject with the observed $y_i$ deviating from $u_i^\top \zeta$ (the "predicted" value
based on the model), $(y_i - u_i^\top \zeta)^2$ has a large value. The exponential function
down-weighs such a contaminated observation. The degree of down-weighing is adjusted
by $\theta$: when $\theta$ gets smaller, the contaminated observations have smaller influence.
While sharing some common ground with Wang et al. (2013) and others, the present
study has three main challenges/advancements. The first is the high dimensionality,
which brings tremendous challenges to theoretical and computational developments.
The second is the need to respect the "main effects, interactions" hierarchy (more
details below). The third is censoring, to accommodate which we introduce the weight
function $\omega_i$ motivated by Stute's approach. As the weights are data-dependent, they
bring challenges to the establishment of theoretical properties.
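The down-weighing effect can be seen on a toy example. The sketch below, in plain Python, evaluates the exponential squared objective and a weighted squared loss on five hypothetical observations, with equal weights standing in for the Kaplan-Meier weights (an assumption made only for illustration). The function names are not from the original study.

```python
import math


def exp_squared_objective(y, U, zeta, weights, theta):
    """Q_theta(zeta) = sum_i w_i * exp(-(y_i - u_i' zeta)^2 / theta)."""
    total = 0.0
    for y_i, u_i, w_i in zip(y, U, weights):
        resid = y_i - sum(u * z for u, z in zip(u_i, zeta))
        total += w_i * math.exp(-resid ** 2 / theta)
    return total


def weighted_squared_loss(y, U, zeta, weights):
    """Weighted squared loss, shown only for comparison."""
    total = 0.0
    for y_i, u_i, w_i in zip(y, U, weights):
        resid = y_i - sum(u * z for u, z in zip(u_i, zeta))
        total += w_i * resid ** 2
    return total


# Hypothetical toy data: intercept plus one covariate, true zeta = (1, 2).
U = [[1.0, x] for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
zeta = [1.0, 2.0]
weights = [0.2] * 5
clean_y = [1.0 + 2.0 * row[1] for row in U]
mild_y = clean_y[:-1] + [clean_y[-1] + 20.0]     # one moderately contaminated outcome
wild_y = clean_y[:-1] + [clean_y[-1] + 2000.0]   # the same outcome, grossly contaminated

q_clean = exp_squared_objective(clean_y, U, zeta, weights, theta=2.0)
q_mild = exp_squared_objective(mild_y, U, zeta, weights, theta=2.0)
q_wild = exp_squared_objective(wild_y, U, zeta, weights, theta=2.0)
```

Making the contaminated outcome one hundred times more extreme leaves the exponential squared objective essentially unchanged, since that subject can lose at most its own weight, whereas the weighted squared loss grows without bound.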

When p > n, regularized estimation is needed. In addition, out of a large number 
of profiled G factors and G-E interactions, only a few are expected to be associated 
with prognosis. We adopt penalization for regularized estimation and identification, 
which has been the choice of a large number of genetic studies, especially recent 
interaction analyses (Bien et al., 2013; Liu et al., 2013; Shi et al., 2014). Specifically, 
consider the penalized robust objective function 


$$L_{\lambda_1,\lambda_2,\theta}(\zeta) = Q_\theta(\zeta) - \sum_{k=1}^{p} \rho(\|b_k\|; \lambda_1, s) - \sum_{k=1}^{p}\sum_{j=2}^{q+1} \rho(|b_{kj}|; \lambda_2, s), \qquad (2)$$

where $\|\cdot\|$ is the $\ell_2$ norm, $\rho(t; \lambda, s) = \lambda \int_0^{|t|} \left(1 - \frac{x}{\lambda s}\right)_+ dx$ is the MCP (minimax
concave penalty; Zhang, 2010), and $b_{kj}$ is the $j$th element of $b_k$. $\lambda_1$ and $\lambda_2$ are
data-dependent tuning parameters, and $s$ is the regularization parameter, per the
terminologies in Zhang (2010). The robust estimator is defined as the maximizer
of $L_{\lambda_1,\lambda_2,\theta}(\zeta)$. An interaction term (or main effect) is concluded as important if its
estimate is nonzero.
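For reference, the MCP integral has a simple closed form: it equals $\lambda|t| - t^2/(2s)$ when $|t| \le \lambda s$ and $\lambda^2 s/2$ otherwise. The sketch below, in plain Python, evaluates the penalty, its derivative, and the two-part penalty as it is displayed in (2). The function names are hypothetical, and each group is assumed to be stored as a list with the main G effect first and its interactions after it.

```python
import math


def mcp(t, lam, s):
    """Closed form of lam * integral_0^{|t|} (1 - x / (lam * s))_+ dx."""
    a = abs(t)
    if a <= lam * s:
        return lam * a - a * a / (2.0 * s)
    return 0.5 * s * lam * lam


def mcp_derivative(t, lam, s):
    """Derivative of the MCP: sgn(t) * (lam - |t| / s)_+."""
    sign = (t > 0) - (t < 0)
    return sign * max(lam - abs(t) / s, 0.0)


def sparse_group_penalty(groups, lam1, lam2, s):
    """Group MCP on each b_k plus element-wise MCP on its interaction entries.

    groups[k] = [beta_k, gamma_1k, ..., gamma_qk] (layout assumed for illustration).
    """
    total = 0.0
    for b in groups:
        total += mcp(math.sqrt(sum(v * v for v in b)), lam1, s)
        total += sum(mcp(v, lam2, s) for v in b[1:])
    return total
```

The penalty is flat beyond $\lambda s$, so large coefficients are not shrunk further, which is the property that distinguishes MCP from Lasso.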

In recent genetic interaction analysis, it has been stressed that the “main effects, 
interactions" hierarchy should be respected. That is, if an interaction term is identified 
as important, its corresponding main effect(s) should be automatically identified. 
G-E interaction analysis has its uniqueness. The E variables usually have a low 
dimensionality and are manually chosen. As such, selection is usually not conducted 
on the E variables (if desirable, this can be easily achieved). Thus for G-E interaction 
analysis, the hierarchy postulates that if a G-E interaction is identified as important,
its corresponding main G effect is automatically identified. In the adopted sparse 
group penalty, the first penalty, which is a group MCP, determines which groups are 
selected. Here one group corresponds to one genetic variable and its interactions. As 
the group MCP does not have within-group sparsity, the second penalty is imposed, 
where we penalize the interaction terms and determine which are nonzero. With the 
special design that the second penalty is only imposed on interactions, important 
interactions correspond to important groups, automatically making the estimates of
the corresponding main G effects nonzero. As such, the combination of the two
penalties guarantees the hierarchy. We note that although sparse group penalization 
has been studied in the literature (Liu et al., 2013), it has been very rarely coupled 
with robust loss functions. It is also noted that MCP can be potentially replaced by 
other penalties. 
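The hierarchy described above can be stated as a simple check on a set of estimated coefficients. The following plain-Python sketch is illustrative only; the function name and the group layout (main G effect first, then its interactions) are assumptions.

```python
def respects_hierarchy(groups, tol=0.0):
    """Return True if every nonzero interaction has a nonzero main G effect.

    groups[k] = [beta_k, gamma_1k, ..., gamma_qk] for the kth genetic variable.
    """
    for b in groups:
        main, interactions = b[0], b[1:]
        has_interaction = any(abs(g) > tol for g in interactions)
        if has_interaction and abs(main) <= tol:
            return False
    return True


# A gene with a main effect and one interaction, plus an unselected gene.
ok = respects_hierarchy([[0.5, 0.0, 0.2], [0.0, 0.0, 0.0]])
# An interaction without its main G effect violates the hierarchy.
violated = respects_hierarchy([[0.0, 0.3]])
```

A main G effect without interactions is allowed, so the check is one-directional.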
2.3 Computation


In this section, we develop an efficient algorithm to compute the maximizer of 
$L_{\lambda_1,\lambda_2,\theta}(\zeta)$. The basic strategy is to iteratively approximate the objective function
by its quadratic minorization. Then a coordinate-wise updating procedure is used to 
find the maximizer of each approximated objective function. The maximizer then 
serves as the starting point for the next minorization. Overall, this is a coordinate- 
descent (CD) algorithm nested in a Minorize-Maximization (MM) algorithm. 

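Let $W(\zeta)$ be a diagonal matrix with the $i$th diagonal element $W_{i,i} = 2\omega_i \exp(-(y_i - u_i^\top \zeta)^2/\theta)/\theta$.
Also let $v(\zeta) = (v_1, \ldots, v_n)^\top$ with $v_i = y_i - u_i^\top \zeta$. Define $U_{\cdot,-j}$ as the
sub-matrix of $U$ with the $j$th column excluded. Define $u_{\cdot j}$ as the $j$th column of
matrix $U$, and $u_{i,j}$ as the $j$th component of vector $u_i$. Similarly, define $\zeta_{-j}$ as the
sub-vector of $\zeta$ with the $j$th element excluded. For the exponential squared objective
function in (1), its first- and second-order derivatives with respect to $\zeta$ are

$$\frac{\partial Q_\theta(\zeta)}{\partial \zeta_j} = 2\sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_i^\top \zeta)^2/\theta\right) u_{i,j}\,(y_i - u_i^\top \zeta)/\theta = u_{\cdot j}^\top W(\zeta) v(\zeta),$$

$$\frac{\partial^2 Q_\theta(\zeta)}{\partial \zeta_j \partial \zeta_k} = 2\sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_i^\top \zeta)^2/\theta\right) u_{i,j} u_{i,k} \left[2(y_i - u_i^\top \zeta)^2/\theta - 1\right]/\theta.$$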


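For subject $i$, if $(y_i - u_i^\top \zeta)^2/\theta > 0.5$, its contribution to $\partial^2 Q_\theta(\zeta)/\partial \zeta_j^2$ is nonnegative.
On the other hand, if $(y_i - u_i^\top \zeta)^2/\theta < 0.5$, its contribution is nonpositive. The Hessian
is therefore not necessarily negative definite. Hence, to find the maximizer of $Q_\theta(\zeta)$, the simple Newton-Raphson
approach may lead to infinity if the starting value is too far from the true value.
To tackle this problem, a minorization of $Q_\theta(\zeta)$ is used to approximate $Q_\theta(\zeta)$.
Note that, in the positive semidefinite ordering, the Hessian of $Q_\theta(\zeta)$ is bounded below by $-U^\top W(\zeta) U$,
whose $(j,k)$th entry is $-2\sum_{i=1}^{n} \omega_i \exp(-(y_i - u_i^\top \zeta)^2/\theta)\, u_{i,j} u_{i,k}/\theta$. Hence a minorized
approximation to $Q_\theta(\zeta)$ at $\zeta^m$ is

$$Q_\theta(\zeta^m) + v(\zeta^m)^\top W(\zeta^m) U (\zeta - \zeta^m) - \frac{1}{2} (\zeta - \zeta^m)^\top U^\top W(\zeta^m) U (\zeta - \zeta^m).$$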
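The minorization property can be checked numerically. Because $\exp(-s/\theta)$ is convex in $s = (y_i - u_i^\top \zeta)^2$, its tangent line at the current residual lies below it, and summing the tangent lines gives exactly the quadratic above. The plain-Python sketch below compares the two functions on a grid, using hypothetical toy data with one gross outlier and equal weights (both assumptions made for illustration).

```python
import math


def residuals(y, U, zeta):
    return [y_i - sum(u * z for u, z in zip(u_i, zeta)) for y_i, u_i in zip(y, U)]


def objective(y, U, zeta, weights, theta):
    """Exponential squared objective Q_theta(zeta)."""
    return sum(w * math.exp(-r * r / theta)
               for w, r in zip(weights, residuals(y, U, zeta)))


def minorizer(y, U, zeta, zeta_m, weights, theta):
    """Quadratic minorizer of Q_theta expanded at zeta_m."""
    value = objective(y, U, zeta_m, weights, theta)
    for w, r, u_i in zip(weights, residuals(y, U, zeta_m), U):
        W_ii = 2.0 * w * math.exp(-r * r / theta) / theta
        step = sum(u * (a - b) for u, a, b in zip(u_i, zeta, zeta_m))  # u_i'(zeta - zeta_m)
        value += W_ii * r * step - 0.5 * W_ii * step * step
    return value


# Hypothetical toy data: five clean points on the line 1 + 2x and one outlier.
U = [[1.0, x] for x in (-1.0, -0.5, 0.0, 0.5, 1.0, 0.25)]
y = [1.0 + 2.0 * row[1] for row in U[:-1]] + [40.0]
weights = [1.0 / 6.0] * 6
theta = 2.0
zeta_m = [0.5, 1.0]

grid = [[-3.0 + 0.5 * a, -3.0 + 0.5 * b] for a in range(13) for b in range(13)]
gaps = [objective(y, U, z, weights, theta) - minorizer(y, U, z, zeta_m, weights, theta)
        for z in grid]
```

Every gap is nonnegative and the gap at the expansion point is zero, so maximizing the quadratic can never decrease the objective.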


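Note that $\zeta^m = (\alpha_0^m, \ldots, \alpha_q^m, (b_1^m)^\top, \ldots, (b_p^m)^\top)^\top$ with $b_k^m = (\beta_k^m, \gamma_{1k}^m, \ldots, \gamma_{qk}^m)^\top$. For
the penalty, we apply a local linear approximation at $\zeta^m$, which is given by

$$-\sum_{k=1}^{p} \dot\rho(\|b_k^m\|; \lambda_1, s) \frac{|\beta_k^m|}{\|b_k^m\|} |\beta_k| - \sum_{k=1}^{p}\sum_{j=1}^{q} \left[ \dot\rho(\|b_k^m\|; \lambda_1, s) \frac{|\gamma_{jk}^m|}{\|b_k^m\|} + \dot\rho(|\gamma_{jk}^m|; \lambda_2, s) \right] |\gamma_{jk}|$$

if the terms that do not depend on $\zeta$ are ignored, where $\dot\rho(t; \lambda, s) = \mathrm{sgn}(t) \left(\lambda - |t|/s\right)_+$.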


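If we replace $Q_\theta(\zeta)$ in (2) with its minorized approximation and plug in the approximation
of the penalty, the penalized objective function then has the form

$$\begin{aligned}
L_{\lambda_1,\lambda_2,\theta}(\zeta|\zeta^m) = {} & Q_\theta(\zeta^m) + v(\zeta^m)^\top W(\zeta^m) U (\zeta - \zeta^m) \\
& - \frac{1}{2} (\zeta - \zeta^m)^\top U^\top W(\zeta^m) U (\zeta - \zeta^m) - \sum_{k=1}^{p} \dot\rho(\|b_k^m\|; \lambda_1, s) \frac{|\beta_k^m|}{\|b_k^m\|} |\beta_k| \\
& - \sum_{k=1}^{p}\sum_{j=1}^{q} \left[ \dot\rho(\|b_k^m\|; \lambda_1, s) \frac{|\gamma_{jk}^m|}{\|b_k^m\|} + \dot\rho(|\gamma_{jk}^m|; \lambda_2, s) \right] |\gamma_{jk}|. \qquad (3)
\end{aligned}$$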

This function has a “weighted quadratic + penalty” form and can be optimized using 
the coordinate-descent approach. 

The algorithm starts with $m = 0$ and $\zeta^m = 0$, where $m$ is the index of the MM
iteration. At iteration $m$, the objective function is approximated by its minorization
$L_{\lambda_1,\lambda_2,\theta}(\zeta|\zeta^m)$ given in (3). Then the penalized weighted quadratic function is
maximized using the coordinate-descent algorithm. Denote $\zeta^{old}$ as the estimate of $\zeta$
before updating. We update each element of the estimate and denote the new estimate
as $\zeta^{new}$. This is repeated until the distance between $\zeta^{old}$ and $\zeta^{new}$ is smaller than a
prefixed constant. Then $\zeta^{m+1} = \zeta^{new}$ serves as the new expansion base point for the
next minorization. The overall procedure is repeated until convergence. Convergence
properties of the MM and CD techniques have been well studied in the literature.
With our problem, the objective function increases at each step and is bounded above,
which leads to convergence. In our numerical study, we conclude convergence if the
difference between two estimates after two consecutive MM steps is small enough. We
observe convergence in all numerical examples after a small to moderate number of
MM iterations.
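The nesting of the two loops can be illustrated on a deliberately simplified problem. The plain-Python sketch below drops both penalties and uses equal weights (that is, no censoring), so it is not the proposed estimator; it only shows coordinate-wise updates inside MM steps for the exponential squared objective. The function names, the toy data, and the stopping constants are assumptions.

```python
import math
import random


def objective(y, U, zeta, weights, theta):
    return sum(w * math.exp(-(y_i - sum(a * b for a, b in zip(u, zeta))) ** 2 / theta)
               for y_i, u, w in zip(y, U, weights))


def mm_cd_fit(y, U, weights, theta, max_mm=200, max_cd=100, tol=1e-10):
    """Unpenalized MM with coordinate-wise inner updates (simplified sketch)."""
    n, d = len(y), len(U[0])
    zeta = [0.0] * d                                  # starting value zeta^0 = 0
    trace = [objective(y, U, zeta, weights, theta)]
    for _ in range(max_mm):
        resid = [y[i] - sum(U[i][j] * zeta[j] for j in range(d)) for i in range(n)]
        # Diagonal of W(zeta^m); fixed during the inner coordinate-wise loop.
        W = [2.0 * weights[i] * math.exp(-resid[i] ** 2 / theta) / theta for i in range(n)]
        new, r = zeta[:], resid[:]
        for _ in range(max_cd):
            biggest = 0.0
            for j in range(d):
                denom = sum(W[i] * U[i][j] ** 2 for i in range(n))
                if denom == 0.0:
                    continue
                # Exact maximizer of the quadratic minorizer along coordinate j.
                step = sum(W[i] * U[i][j] * r[i] for i in range(n)) / denom
                new[j] += step
                r = [r[i] - U[i][j] * step for i in range(n)]
                biggest = max(biggest, abs(step))
            if biggest < tol:
                break
        change = max(abs(a - b) for a, b in zip(new, zeta))
        zeta = new                                    # new expansion point
        trace.append(objective(y, U, zeta, weights, theta))
        if change < tol:
            break
    return zeta, trace


# Hypothetical data: y = 1 + 2x + noise, with every tenth outcome shifted by 25.
rng = random.Random(0)
n = 40
U = [[1.0, -1.0 + 2.0 * i / (n - 1)] for i in range(n)]
y = [1.0 + 2.0 * u[1] + rng.gauss(0.0, 0.1) for u in U]
y = [v + 25.0 if i % 10 == 0 else v for i, v in enumerate(y)]
weights = [1.0 / n] * n

zeta_hat, trace = mm_cd_fit(y, U, weights, theta=2.0)
```

The recorded objective values are nondecreasing across MM steps, in line with the ascent property noted above. Adding the penalties would replace the plain coordinate update with a thresholded one, which is not attempted here.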

The proposed method involves tuning parameters. For $s$ in MCP, we follow
Zhang (2010) and other published studies, which suggest examining a small number
of values or fixing it. In our numerical study, we fix $s = 6$, which has been adopted in
published studies (Shi et al., 2014; Xu et al., 2018). We have also examined $s$ values
near 6 and observed similar performance (details omitted). In practice, for settings
significantly different from ours, other $s$ values may need to be considered. Under
low-dimensional settings, Wang et al. (2013) proposed an iterative approach to select
the robust tuning parameter $\theta$. However, their approach is computationally infeasible
for high-dimensional data. Under the present setting, for each combination of
$(\lambda_1, \lambda_2, \theta)$, we compute the solution. This way, we can obtain a solution surface over
a three-dimensional tuning parameter grid. This is feasible as the proposed computational
algorithm only involves simple updates and incurs low cost. Then the tuning
parameters can be selected using a prediction-based method which proceeds as follows:
(a) compute the cross-validated sum of prediction errors for each $(\lambda_1, \lambda_2, \theta)$
combination; (b) for each fixed $\theta$, average the sum of prediction errors over $\lambda_1, \lambda_2$.
Select $\theta$ that has the smallest average sum of prediction errors; (c) with the selected
$\theta$, select $\lambda_1, \lambda_2$ that has the smallest sum of prediction errors. This procedure first
groups all $(\lambda_1, \lambda_2)$ values together and selects the best $\theta$ value. Then with the optimal
$\theta$ value, the optimal $(\lambda_1, \lambda_2)$ values are selected. Our numerical experiments suggest
that this procedure generates more stable estimates than directly searching over the
three-dimensional $(\lambda_1, \lambda_2, \theta)$ grid.
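Steps (b) and (c) amount to a two-stage search over a table of cross-validated errors. The sketch below, in plain Python, assumes the errors from step (a) are already available in a dictionary keyed by $(\lambda_1, \lambda_2, \theta)$; the function name and the toy error values are hypothetical.

```python
def select_tuning(cv_error):
    """Two-stage selection: theta by average error, then (lambda1, lambda2).

    cv_error maps (lambda1, lambda2, theta) -> cross-validated prediction error.
    """
    by_theta = {}
    for (lam1, lam2, theta), err in cv_error.items():
        by_theta.setdefault(theta, []).append(((lam1, lam2), err))
    # Step (b): theta with the smallest error averaged over (lambda1, lambda2).
    best_theta = min(by_theta,
                     key=lambda t: sum(e for _, e in by_theta[t]) / len(by_theta[t]))
    # Step (c): best (lambda1, lambda2) for the selected theta.
    (best_lam1, best_lam2), _ = min(by_theta[best_theta], key=lambda item: item[1])
    return best_lam1, best_lam2, best_theta


# Toy errors: theta = 2.0 holds the single smallest error, yet theta = 1.0 is better on average.
cv_error = {
    (0.1, 0.1, 1.0): 1.0, (0.1, 0.2, 1.0): 0.9,
    (0.2, 0.1, 1.0): 1.1, (0.2, 0.2, 1.0): 1.2,
    (0.1, 0.1, 2.0): 0.5, (0.1, 0.2, 2.0): 3.0,
    (0.2, 0.1, 2.0): 3.0, (0.2, 0.2, 2.0): 3.0,
}
chosen = select_tuning(cv_error)
```

In the toy table a direct three-dimensional search would pick the isolated minimum at $\theta = 2$, while the two-stage rule picks $\theta = 1$, which is the stabilizing behavior described above.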
With a complex robust goodness-of-fit and a penalty that respects the hierarchy,
the proposed method is inevitably computationally more expensive than some sim- 
pler alternatives. However, as the proposed computational algorithm is composed 
of relatively simple calculations, the overall computational cost is affordable. With 
fixed tunings, the analysis of one simulated dataset (described in detail below) takes 
about nine minutes on a regular laptop. Tuning parameter selection can be conducted 
in a highly parallel manner to save computer time. 


2.4 Consistency Properties 


In this section, we rigorously prove that the proposed method can consistently identify 
the important interactions (and main effects) under ultrahigh-dimensional settings. In 
the literature, theoretical development for robust methods under high-dimensional 
settings has been limited. It is especially rare for methods other than the quantile-based
ones. With the consistency properties, the proposed method can be preferred over
the alternatives whose statistical properties have not been well established. Our the- 
oretical development not only provides a solid ground for the proposed method but 
also sheds insights for other robust methods under high-dimensional settings. 


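For any two subsets $S_1$ and $S_2$ of $\{1, \ldots, p + q + pq + 1\}$ and a matrix $H$, we
denote by $H_{S_1,S_2}$ the sub-matrix of $H$ with rows and columns indexed by $S_1$ and $S_2$,
respectively. Let $\zeta^* = (\alpha_0^*, \ldots, \alpha_q^*, (b_1^*)^\top, \ldots, (b_p^*)^\top)^\top$, where $b_k^* = (\beta_k^*, \gamma_{1k}^*, \ldots, \gamma_{qk}^*)^\top$,
be the true value of $\zeta$. Here we make the sparsity assumption, under which only a
subset of the components of $\zeta^*$ is nonzero. Define the three groups of parameters:

$$A_1 = \{\alpha_0^*, \ldots, \alpha_q^*\}, \qquad A_2 = \{\gamma_{jk}^* : \gamma_{jk}^* \neq 0,\ j = 1, \ldots, q;\ k = 1, \ldots, p\},$$
$$A_3 = \{\beta_k^* : \beta_k^* \neq 0 \text{ or there exists some } 1 \le j \le q \text{ such that } \gamma_{jk}^* \neq 0,\ k = 1, \ldots, p\}.$$

Denote $\mathcal{A}$ as the set of indices of $A_1 \cup A_2 \cup A_3$ in the vector $\zeta^*$. Let $\mathcal{A}^c$ and $|\mathcal{A}|$
denote the complement and cardinality of set $\mathcal{A}$, respectively. We then divide $\mathcal{A}^c$
into three sets of indices $\mathcal{B}_1$, $\mathcal{B}_2$, and $\mathcal{B}_3$, which correspond to the following three
sets

$$B_1 = \{\beta_k^* : \beta_k^* = 0,\ k = 1, \ldots, p\},$$
$$B_2 = \{\gamma_{jk}^* : \gamma_{jk}^* = 0 \text{ but } \beta_k^* \neq 0,\ j = 1, \ldots, q;\ k = 1, \ldots, p\},$$
$$B_3 = \{\gamma_{jk}^* : \gamma_{jk}^* = 0 \text{ and } \beta_k^* = 0,\ j = 1, \ldots, q;\ k = 1, \ldots, p\},$$

respectively. Define

$$D_n(\zeta) = \sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_i^\top \zeta)^2/\theta\right) \frac{2(y_i - u_i^\top \zeta)}{\theta}\, u_i,$$

and

$$I_n(\zeta) = \frac{2}{\theta} \sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_i^\top \zeta)^2/\theta\right) \left[\frac{2(y_i - u_i^\top \zeta)^2}{\theta} - 1\right] u_i u_i^\top.$$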


The following conditions are needed to establish the consistency properties. 


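C1. $T$ and $C$ are independent, and $P(T \le C \mid T, X, Z) = P(T \le C \mid T)$.

C2. The support of $T$ is dominated by that of $C$. For example, $\tau_T < \tau_C$ or $\tau_T =
\tau_C = \infty$, where $\tau_T$ and $\tau_C$ are the right end points of the support of $T$ and $C$,
respectively.

C3. $E[D_n(\zeta^*)] = 0$.

C4. The distributions of $D_{n,j}(\zeta^*)$'s are subgaussian, that is, $\Pr(|D_{n,j}(\zeta^*)| > t) \le
2\exp(-nt^2/\sigma)$. Moreover, $I_{n,jk}(\zeta) - I_{jk}(\zeta)$'s are subgaussian for all $\zeta \in
\Theta = \{\zeta : \|\zeta - \zeta^*\|_\infty \le \delta\}$, where $\delta$ is a positive constant, $I(\zeta) = E[I_n(\zeta)]$,
and $I_{jk}(\zeta)$ is the $(j, k)$th component of matrix $I(\zeta)$. Moreover, there exists
a bounded constant $\kappa$ such that $v^\top [I(\zeta^1) - I(\zeta^2)] v \le \kappa \|\zeta^1 - \zeta^2\|_2$ for any
$\zeta^1, \zeta^2 \in \Theta$ and $\|v\|_2 = 1$.

C5. $I_{\mathcal{A},\mathcal{A}}(\zeta^*)$ is a $|\mathcal{A}| \times |\mathcal{A}|$ negative-definite matrix. The eigenvalues of $I_{\mathcal{A},\mathcal{A}}(\zeta^*)$
are bounded away from zero and infinity.

C6. $\min\{|\gamma_{jk}^*| : \gamma_{jk}^* \neq 0\} \gg \lambda_1 \vee \lambda_2$, and $\lambda_1 \wedge \lambda_2 \gg \sqrt{|\mathcal{A}|/n}$.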


C1 and C2 have been commonly assumed in the literature. See, for example, Stute
(1993, 1996) and Huang et al. (2007). We note that the independent censoring assumption
usually holds in practice, although from a theoretical perspective, quite a few studies
have made the weaker conditional independence assumption. We have explored
relaxing this assumption and found that alternative and less intuitive assumptions
would have to be made. The zero expectations in C3 and C5 ensure the consistency
of estimation. C4 is required for Theorem 1, and a similar assumption has been made
in Ma and Du (2012). C6 requires that the smallest signal does not decay too fast,
which is common in studies on high-dimensional inference. The following theorem
establishes consistency of the proposed estimator $\hat\zeta$.

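Theorem 1 Suppose that conditions C1-C6 hold.
Let $a_n = (\lambda_1 \wedge \lambda_2)/\max(\Phi_1, \Phi_2, \Phi_3)$, where $\Phi_t = \|I_{\mathcal{B}_t,\mathcal{A}}(\zeta^*) (I_{\mathcal{A},\mathcal{A}}(\zeta^*))^{-1}\|_\infty$,
$t = 1, 2, 3$. If $|\mathcal{A}| = o(n)$, $\lambda_1 \vee \lambda_2 \to 0$, $n a_n^2 \to \infty$, and $\log p = o(n a_n^2)$, then
with probability tending to one, we have

(a) $\|\hat\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}\|_\infty = O_p\big(\sqrt{|\mathcal{A}|/n}\big)$; (b) $\hat\zeta_{\mathcal{A}^c} = 0$.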


Proof For the proof, see Appendix. 


This theorem establishes that the proposed method is able to accommodate $p$ with
$\log p = o(n a_n^2)$. The penalized robust estimator enjoys the same asymptotic properties
as the oracle estimator with probability approaching one. This property holds
under high dimensions without restrictive conditions on the errors. To the best of our 
knowledge, properties of the robust exponential loss, even without censoring, have 
not been studied under high-dimensional settings. Thus our theoretical investigation 
can have independent value. Proof of the theorem is presented in Appendix. 
3 Simulations


In simulation, we set n = 300, q = 5, and p = 1000. The underlying true model contains
a total of 35 nonzero effects, including 5 main E effects, 10 main G effects, and
20 interactions. The "positions" of nonzero main G effects are randomly placed. The 
nonzero interactions are generated to respect the “main effects, interactions" hierar- 
chy. The nonzero regression coefficients are randomly generated from uniform (0.7, 
1.3). We consider both continuous and categorical distributions to mimic, for exam- 
ple, gene expression and SNP data. Specifically, under the continuous scenario, the 
E and G factors are generated from multivariate normal distributions with marginal 
means zero, marginal variances one, and the following variance matrix structures: 
Independent, AR(0.3), AR(0.8), Band(0.3), Band(0.6), and CS(0.2). Under the independent
scenario, all factors have zero correlations. Under the AR($\rho$) correlation
structure, for the $i$th and $j$th factors, corr $= \rho^{|i-j|}$. Under the Band($\rho$) correlation
structure, for the ith and jth factors, corr = p: I (|i — j| = 2) -03- I (|i — j| = 
1) + I(|i − j| = 0). Under the CS($\rho$) correlation structure, for the $i$th and $j$th factors,
the correlation coefficient corr $= \rho^{I(i \neq j)}$. Under the categorical scenario, we
first apply the same data generating approach as described above to obtain $U$. Then for
each $u_{i,j}$, the categorical measurement is generated as $I(u_{i,j} > -0.7)$. The threshold
value −0.7 is chosen such that the proportion of 1's for each factor is roughly 75%.
Under each of the above simulation settings, consider the random error distribution
$(1 - \xi)N(0, 1) + \xi\,\mathrm{Cauchy}$, with the contamination probability $\xi = 0$, 0.1, and 0.3.
When $\xi = 0$, the error distribution has no contamination and favors the nonrobust
approaches, while the latter two values lead to different levels of contamination.
The log event times are generated from the AFT model. The censoring times are 
generated independently from Weibull distributions. The censoring parameters are 
adjusted so that the censoring rates are about 25%. Beyond the above scenarios, we 
also consider a set of parallel scenarios, under which there are 10 main E effects, 
20 main G effects, and 40 interactions (that is, the number of important effects is 
doubled), and the nonzero coefficients are generated from uniform (0.4, 0.6) (that is, 
the signal levels are reduced by about 50%). Other settings remain the same.
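Parts of this design can be sketched with the Python standard library alone. The helpers below generate one vector of factors under the AR and CS structures, dichotomize it at −0.7, and draw a contaminated error. The function names are hypothetical, the Band structure and the censoring step are not covered, and the constructions used (a first-order recursion for AR, a shared normal component for CS) are standard devices rather than the authors' code.

```python
import math
import random


def ar_row(p, rho, rng):
    """One draw of p standard normal factors with corr = rho ** |i - j|."""
    x = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(p - 1):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


def cs_row(p, rho, rng):
    """One draw of p standard normal factors with common correlation rho."""
    shared = rng.gauss(0.0, 1.0)
    return [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(p)]


def dichotomize(row, threshold=-0.7):
    """Categorical measurement I(u > threshold)."""
    return [1 if v > threshold else 0 for v in row]


def contaminated_error(xi, rng):
    """Draw from the mixture (1 - xi) * N(0, 1) + xi * Cauchy."""
    if rng.random() < xi:
        return math.tan(math.pi * (rng.random() - 0.5))  # standard Cauchy, inverse CDF
    return rng.gauss(0.0, 1.0)


rng = random.Random(1)
ar_rows = [ar_row(5, 0.3, rng) for _ in range(4000)]
cs_rows = [cs_row(2, 0.2, rng) for _ in range(4000)]
```

Since the standard normal distribution function at 0.7 is about 0.758, the dichotomized factors equal one in roughly three quarters of the draws, in line with the description above.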

The simulated data are analyzed using the proposed method. In addition, we also 
consider two alternatives: (a) the nonrobust method that adopts the weighted least 
squared loss and the same penalty as the proposed, and (b) the quantile regression-based
method that adopts an $L_1$ robust loss and the same penalty as the proposed.
We note that multiple other methods are potentially applicable. Comparing with 
the nonrobust method can directly establish the merit of being robust. The quantile 
regression-based approach is the most popular for high-dimensional data (Wu & Ma, 
2015). Thus these two alternatives are the most sensible to compare with. 

All three methods involve tuning parameters. To eliminate the (possibly differ- 
ent) effects of tuning parameter selection on identification accuracy, we consider a 
sequence of tuning parameter values, evaluate identification accuracy at each value, 
and calculate the AUC (area under the ROC curve) as the overall measure. This 
approach has been adopted extensively in published studies (Zhu et al., 2014). 
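The AUC summary can be sketched as follows, in plain Python. Each tuning parameter value yields a set of selected effects, which gives one (false positive rate, true positive rate) point, and the area is obtained by the trapezoid rule. The function name is hypothetical, and anchoring the curve at (0, 0) and (1, 1) is an assumption of this sketch rather than a detail stated in the text.

```python
def identification_auc(selected_sets, true_set, n_candidates):
    """Area under the (FPR, TPR) curve traced by a sequence of selected index sets."""
    true_set = set(true_set)
    n_pos = len(true_set)
    n_neg = n_candidates - n_pos
    points = {(0.0, 0.0), (1.0, 1.0)}          # assumed end points of the curve
    for selected in selected_sets:
        selected = set(selected)
        tp = len(selected & true_set)
        fp = len(selected - true_set)
        points.add((fp / n_neg, tp / n_pos))
    curve = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0    # trapezoid rule
    return area


# Four candidate effects, of which effects 0 and 1 are truly nonzero.
perfect_path = [set(), {0}, {0, 1}, {0, 1, 2}, {0, 1, 2, 3}]
mixed_path = [{0}, {0, 2}, {0, 1, 2}, {0, 1, 2, 3}]
auc_perfect = identification_auc(perfect_path, {0, 1}, 4)
auc_mixed = identification_auc(mixed_path, {0, 1}, 4)
```

A path that admits every true effect before any false one attains the maximum area of one, while admitting a false effect early lowers the area.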
Summary statistics are computed based on 500 replicates. The AUC results for
interactions and main effects combined are presented in Tables 1 and 2, respectively, 
for the scenarios with 35 and 70 important effects. To be thorough, we have also 
evaluated identification accuracy for interactions and main effects separately and 
present the AUC results in Tables 4, 5, 6 and 7 in Appendix. For all three methods, 
the AUC value decreases as the contamination proportion increases, as expected. In 
Table 1, the proposed method outperforms the two alternatives under all except one 
scenario. In Table 2, it dominates the alternatives. Under some scenarios, the proposed 
method leads to a significant improvement in identification accuracy. For example 
in Table 1, with the continuous G distribution, 30% contamination, and Band(0.3)
correlation, the proposed method has a mean AUC of 0.901, while the alternatives 
have mean AUCs of 0.761 and 0.789. Compared to the nonrobust alternative, the 
proposed method also has smaller standard errors (Table 3). 

We have also experimented with a few other scenarios and made similar observa- 
tions. In particular, we have examined the scenarios where the event and censoring 
times have weak to moderate correlations and observed similar satisfactory per- 
formance (details omitted). The proposed method and two alternatives respect the 
hierarchy. We have also looked into simpler alternatives, including MCP and Lasso, 
which may violate the hierarchy, and observed inferior performance. 


4 Analysis of the TCGA Lung Adenocarcinoma Data 


Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Profiling 
studies have been extensively conducted searching for its prognostic factors. Here 
we analyze the TCGA (The Cancer Genome Atlas Research Network, 2014) data on 
the prognosis of lung adenocarcinoma. The TCGA data were recently collected and 
published by NCI and have high quality. The prognosis outcome of interest is overall 
survival. The dataset contains measurements on 43 clinical/environmental variables 
and 18,897 gene expressions. There are a total of 468 patients, among whom 117 died 
during follow-up. The median follow-up time is 8 months. We select four E factors 
for downstream analysis, namely, age, gender, smoking pack years, and smoking 
history. These factors have a relatively low missing rate in the TCGA dataset and 
have been previously suggested as potentially related to lung cancer prognosis. There 
are a total of 436 samples with both E and G measurements available. Among them, 
110 died during follow-up, and the median follow-up time is 23 months. For the 326 
censored subjects, the median follow-up time is 6 months. In principle, the proposed 
method can directly analyze all of the available gene expressions. To improve stability 
and reduce the computational cost, we conduct marginal prescreening. Specifically, 
genes are screened based on their univariate regression significance (p-value less 
than or equal to 0.1) and interquartile range (above the median of all interquartile 
ranges). Similar prescreenings have been adopted in the literature. A total of 819 
gene expressions are included in the downstream model fitting. Note that with the 
Table 1 Simulation: identification of both G-E interactions and main G effects. In each cell, mean
AUC (se). There are a total of 35 nonzero effects, with coefficients ~ uniform (0.7, 1.3)

ξ     Cor         Proposed        Nonrobust       Quantile

Continuous
0     AR(0)       0.891(0.065)    0.842(0.095)    0.806(0.043)
      AR(0.3)     0.971(0.050)    0.917(0.096)    0.832(0.025)
      AR(0.8)     0.981(0.041)    0.923(0.066)    0.881(0.024)
      Band(0.3)   0.972(0.057)    0.908(0.106)    0.828(0.024)
      Band(0.6)   0.978(0.044)    0.930(0.078)    0.725(0.024)
      CS(0.2)     0.920(0.069)    0.827(0.096)    0.854(0.024)
0.1   AR(0)       0.824(0.077)    0.733(0.114)    0.782(0.042)
      AR(0.3)     0.951(0.057)    0.858(0.130)    0.815(0.031)
      AR(0.8)     0.970(0.061)    0.841(0.119)    0.873(0.034)
      Band(0.3)   0.945(0.093)    0.850(0.143)    0.802(0.021)
      Band(0.6)   0.959(0.058)    0.865(0.131)    0.704(0.037)
      CS(0.2)     0.898(0.087)    0.779(0.109)    0.846(0.043)
0.3   AR(0)       0.769(0.086)    0.646(0.097)    0.775(0.045)
      AR(0.3)     0.889(0.101)    0.742(0.147)    0.788(0.021)
      AR(0.8)     0.942(0.077)    0.754(0.124)    0.856(0.025)
      Band(0.3)   0.901(0.093)    0.761(0.140)    0.789(0.022)
      Band(0.6)   0.924(0.075)    0.785(0.131)    0.691(0.041)
      CS(0.2)     0.845(0.092)    0.661(0.117)    0.831(0.045)

Categorical
0     AR(0)       0.890(0.062)    0.838(0.092)    0.778(0.045)
      AR(0.3)     0.963(0.054)    0.913(0.093)    0.802(0.028)
      AR(0.8)     0.975(0.041)    0.918(0.068)    0.843(0.021)
      Band(0.3)   0.971(0.041)    0.932(0.080)    0.787(0.033)
      Band(0.6)   0.972(0.039)    0.925(0.079)    0.702(0.042)
      CS(0.2)     0.917(0.082)    0.818(0.097)    0.822(0.047)
0.1   AR(0)       0.835(0.085)    0.756(0.115)    0.749(0.043)
      AR(0.3)     0.944(0.055)    0.856(0.130)    0.785(0.033)
      AR(0.8)     0.970(0.037)    0.867(0.102)    0.831(0.041)
      Band(0.3)   0.953(0.052)    0.862(0.119)    0.764(0.025)
      Band(0.6)   0.965(0.044)    0.861(0.128)    0.678(0.032)
      CS(0.2)     0.895(0.086)    0.752(0.115)    0.803(0.035)
0.3   AR(0)       0.771(0.090)    0.635(0.118)    0.738(0.043)
      AR(0.3)     0.895(0.087)    0.722(0.131)    0.771(0.024)
      AR(0.8)     0.946(0.057)    0.785(0.119)    0.817(0.028)
      Band(0.3)   0.897(0.115)    0.748(0.153)    0.741(0.027)
      Band(0.6)   0.921(0.083)    0.751(0.140)    0.649(0.047)
      CS(0.2)     0.822(0.110)    0.660(0.113)    0.787(0.031)
Table 2 Simulation: identification of both G-E interactions and main G effects. There are a total
of 70 nonzero effects, with coefficients ~ uniform (0.4, 0.6)

ξ     Cor         Proposed        Nonrobust       Quantile

Continuous
0     AR(0)       0.678(0.043)    0.649(0.042)    0.645(0.051)
      AR(0.3)     0.800(0.045)    0.768(0.057)    0.771(0.046)
      AR(0.8)     0.916(0.058)    0.828(0.054)    0.812(0.044)
      Band(0.3)   0.827(0.057)    0.781(0.067)    0.787(0.039)
      Band(0.6)   0.865(0.052)    0.816(0.061)    0.719(0.025)
      CS(0.2)     0.717(0.065)    0.645(0.047)    0.668(0.037)
0.1   AR(0)       0.651(0.040)    0.623(0.047)    0.619(0.045)
      AR(0.3)     0.737(0.069)    0.668(0.085)    0.672(0.038)
      AR(0.8)     0.892(0.050)    0.779(0.081)    0.795(0.047)
      Band(0.3)   0.790(0.061)    0.710(0.100)    0.759(0.055)
      Band(0.6)   0.827(0.060)    0.767(0.080)    0.754(0.041)
      CS(0.2)     0.691(0.061)    0.613(0.053)    0.672(0.042)
0.3   AR(0)       0.605(0.052)    0.551(0.042)    0.561(0.048)
      AR(0.3)     0.697(0.064)    0.601(0.058)    0.633(0.037)
      AR(0.8)     0.838(0.081)    0.679(0.093)    0.719(0.042)
      Band(0.3)   0.713(0.079)    0.608(0.080)    0.668(0.039)
      Band(0.6)   0.754(0.085)    0.648(0.102)    0.651(0.035)
      CS(0.2)     0.668(0.059)    0.568(0.065)    0.611(0.041)

Categorical
0     AR(0)       0.675(0.045)    0.645(0.040)    0.643(0.038)
      AR(0.3)     0.784(0.057)    0.769(0.067)    0.758(0.042)
      AR(0.8)     0.909(0.058)    0.826(0.052)    0.799(0.051)
      Band(0.3)   0.799(0.058)    0.776(0.065)    0.774(0.034)
      Band(0.6)   0.847(0.063)    0.827(0.062)    0.688(0.039)
      CS(0.2)     0.719(0.064)    0.634(0.041)    0.677(0.049)
0.1   AR(0)       0.654(0.052)    0.596(0.060)    0.604(0.037)
      AR(0.3)     0.748(0.063)    0.683(0.093)    0.695(0.052)
      AR(0.8)     0.869(0.085)    0.764(0.086)    0.772(0.041)
      Band(0.3)   0.772(0.071)    0.712(0.093)    0.733(0.039)
      Band(0.6)   0.806(0.067)    0.736(0.099)    0.729(0.048)
      CS(0.2)     0.684(0.058)    0.595(0.051)    0.638(0.034)
0.3   AR(0)       0.614(0.056)    0.557(0.049)    0.571(0.052)
      AR(0.3)     0.697(0.065)    0.614(0.068)    0.632(0.047)
      AR(0.8)     0.824(0.092)    0.694(0.103)    0.727(0.045)
      Band(0.3)   0.720(0.071)    0.639(0.085)    0.655(0.036)
      Band(0.6)   0.749(0.087)    0.644(0.090)    0.648(0.045)
      CS(0.2)     0.666(0.056)    0.574(0.050)    0.629(0.038)
main G effects as well as interactions, the number of unknown parameters is much
larger than the sample size. 

Detailed estimation results are presented in Table 3 for the proposed method 
and Tables 8 and 9 in Appendix for the two alternatives. It is observed that the 
three methods lead to quite different findings. Specifically, the proposed and quantile 
methods share four common main G effects and four interactions. Otherwise, there 
is no overlap in identification. The "signals" in practical data can be weaker than 
those in simulated data, leading to the significant differences across methods. 

With the proposed method, sixteen genes are identified as having interactions with 
either age or smoking status. As for many other cancer types, age has been identified 
as a critical factor in lung cancer prognosis. Smoking has been confirmed as the 
most important E factor for lung cancer risk and prognosis. In the literature, G-E 
interaction analysis for lung cancer prognosis is still very limited. However, there 
have been many studies on the functionalities of genes. Searching such studies can 
provide partial support for the validity of our analysis results. Among the identified 
genes, many have been implicated in cancer in the literature. Specifically, the AGPAT 
family, which includes AGPAT6 as a member, has been found to play a role in 
multiple cancer types. For example, AGPAT2 and AGPAT11 have been found to be 
upregulated in ovarian, breast, cervical, and colorectal cancers (Agarwal and Garg, 
2010). Another gene worth attention is ATF6, which acts both as a sensor and 
a transcription factor during endoplasmic reticulum stress. ATF6α has been found 
to promote hepatocarcinogenesis and cancer cell proliferation through activating 
the downstream target gene BIP. Its efficiency of stress recognition and signaling has been 
found to decrease with age (Naidoo, 2009). We find that gene COLCA2 (colorectal 
cancer associated 2) interacts with smoking pack years. Studies have shown that 
COLCA2 may have critical functions in suppressing tumor formation in epithelial 
cells (Peltekova et al., 2014). We also identify an interaction between NOS1AP and 
age. It has been found that the protein complex of SCRIB, NOS1AP, and VANGL1 
regulates cell polarity and migration, and this complex can be associated with cancer 
progression (Anastas et al., 2012). An interaction between PPP1R15B and smoking 
pack years has also been identified. It has been suggested that PPP1R15B is likely to 
be regulated by Nrf2, which has a protective response to smoking-induced oxidative 
stress in the lung (Taylor et al., 2008). Also, PPP1R15B may promote cancer cell 
proliferation. 

To complement the identification and estimation analysis, we also evaluate stability. 
Specifically, we randomly select 3/4 of the subjects and apply the proposed 
method and the alternatives. This procedure is repeated 200 times. We then compute the 
probability that an interaction is identified, that is, the proportion of the 200 
subsamples in which it is selected. Similar procedures have been extensively 
adopted in published studies. The stability results are also provided in Tables 3, 8, 
and 9. We see that most of the identified interactions are relatively stable, with many 
having probabilities of being identified close to one. 
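
The subsampling procedure can be sketched as follows. This is only an illustration under stated assumptions: the selector `top_k_by_correlation` is a hypothetical stand-in (ranking by marginal correlation) and is not the proposed penalized robust method; the function names and defaults are likewise assumptions. Any selector that returns a set of selected column indices could be plugged in.

```python
import random
from statistics import mean


def top_k_by_correlation(X, y, k=2):
    """Hypothetical stand-in selector: keep the k columns with the largest
    absolute marginal correlation with y. The chapter's penalized robust
    estimator would be used here in a real analysis."""
    p = len(X[0])
    y_bar = mean(y)
    syy = sum((v - y_bar) ** 2 for v in y)
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        c_bar = mean(col)
        sxy = sum((c - c_bar) * (v - y_bar) for c, v in zip(col, y))
        sxx = sum((c - c_bar) ** 2 for c in col)
        denom = (sxx * syy) ** 0.5
        scores.append(abs(sxy) / denom if denom > 0 else 0.0)
    return set(sorted(range(p), key=lambda j: scores[j])[-k:])


def selection_frequency(X, y, selector, n_rep=200, frac=0.75, seed=0):
    """Proportion of random subsamples in which each column is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = round(frac * n)                      # 3/4 of the subjects by default
    counts = [0] * p
    for _ in range(n_rep):
        idx = rng.sample(range(n), m)        # subsample without replacement
        selected = selector([X[i] for i in idx], [y[i] for i in idx])
        for j in selected:
            counts[j] += 1
    return [c / n_rep for c in counts]
```

A frequency near one indicates an effect that is selected in almost every subsample, which is how the values in "()" in Tables 3, 8, and 9 are read.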
Table 3 Analysis of the TCGA lung adenocarcinoma data using the proposed method. The identified 
interactions are denoted as "gene * environmental variable". For the interactions, values in 
"()" are the stability results 

Effect                            Estimate × 100 
Age                               8.817 
Smoking pack years                -0.358 
AGPAT6                            55.343 
ANKRD46                           4.646 
ATF6                              40.350 
C1ORF27                           7.708 
COLCA2                            -1.808 
CAND1                             32.138 
DNAJC21                           6.652 
DYRK2                             -24.595 
HERPUD2                           -40.358 
LCMT2                             40.151 
NOS1AP                            -28.707 
PIGZ                              -19.058 
PPP1R15B                          -2.411 
TROVE2                            -5.979 
WIPI2                             -18.739 
YTHDF3                            21.524 
AGPAT6 * age                      0.202(0.995) 
ANKRD46 * age                     0.546(0.537) 
ATF6 * age                        0.493(0.193) 
C1ORF27 * age                     -0.072(0.989) 
COLCA2 * smoking pack years       0.315(0.993) 
CAND1 * smoking pack years        -0.716(0.649) 
DNAJC21 * age                     -0.280(1.000) 
DYRK2 * age                       0.496(0.330) 
HERPUD2 * age                     -0.393(0.975) 
LCMT2 * age                       -0.222(0.927) 
NOS1AP * age                      0.140(0.734) 
PIGZ * age                        0.046(0.397) 
PPP1R15B * smoking pack years     0.711(0.998) 
TROVE2 * age                      -0.416(0.890) 
WIPI2 * age                       -0.205(1.000) 
YTHDF3 * age                      -0.201(0.839) 
Table 4 Simulation: identification of main G effects. In each cell, mean AUC (se). There are a 
total of 10 nonzero main effects, with coefficients ~uniform (0.7, 1.3) 


E Cor Robust Nonrobust Quantile 

Continuous 

0 AR(0) 0.94(0.055) 0.964(0.052) 0.867(0.041) 
AR(0.3) 0.985(0.019) 0.998(0.002) 0.882(0.052) 
AR(0.8) 0.987(0.019) 0.994(0.019) 0.912(0.032) 
BAND(0.3) 0.985(0.020) 0.999(0.003) 0.841(0.046) 
BAND(0.6) 0.985(0.022) 0.998(0.007) 0.792(0.047) 
CS(0.2) 0.938(0.047) 0.923(0.061) 0.863(0.039) 

0.1 AR(0) 0.883(0.079) 0.834(0.135) 0.852(0.044) 
AR(0.3) 0.975(0.028) 0.956(0.108) 0.841(0.053) 
AR(0.8) 0.981(0.052) 0.922(0.127) 0.891(0.034) 
BAND(0.3) 0.967(0.075) 0.942(0.136) 0.836(0.036) 
BAND(0.6) 0.975(0.036) 0.944(0.128) 0.789(0.044) 
CS(0.2) 0.921(0.079) 0.851(0.126) 0.855(0.031) 

0.3 AR(0) 0.837(0.102) 0.716(0.127) 0.792(0.028) 
AR(0.3) 0.942(0.082) 0.829(0.168) 0.814(0.031) 
AR(0.8) 0.970(0.070) 0.841(0.135) 0.855(0.042) 
BAND(0.3) 0.943(0.079) 0.836(0.158) 0.821(0.028) 
BAND(0.6) 0.954(0.067) 0.871(0.144) 0.811(0.053) 
CS(0.2) 0.886(0.095) 0.704(0.142) 0.824(0.058) 

Categorical 

0 AR(0) 0.931(0.047) 0.956(0.050) 0.857(0.044) 
AR(0.3) 0.970(0.030) 0.998(0.010) 0.872(0.058) 
AR(0.8) 0.976(0.028) 0.992(0.023) 0.883(0.044) 
BAND(0.3) 0.974(0.027) 0.999(0.002) 0.832(0.041) 
BAND(0.6) 0.974(0.029) 0.999(0.010) 0.824(0.036) 
CS(0.2) 0.937(0.051) 0.918(0.065) 0.844(0.051) 

0.1 AR(0) 0.894(0.078) 0.868(0.137) 0.853(0.039) 
AR(0.3) 0.963(0.036) 0.959(0.104) 0.875(0.056) 
AR(0.8) 0.975(0.029) 0.939(0.090) 0.897(0.064) 
BAND(0.3) 0.968(0.036) 0.960(0.090) 0.841(0.042) 
BAND(0.6) 0.971(0.033) 0.934(0.117) 0.829(0.048) 
CS(0.2) 0.923(0.075) 0.824(0.138) 0.846(0.057) 

0.3 AR(0) 0.838(0.098) 0.707(0.153) 0.788(0.029) 
AR(0.3) 0.932(0.075) 0.808(0.159) 0.818(0.035) 
AR(0.8) 0.967(0.041) 0.863(0.143) 0.854(0.048) 


BAND(0.3) 0.931(0.101) 0.842(0.173) 0.819(0.035) 
BAND(0.6) 0.946(0.065) 0.821(0.155) 0.813(0.044) 
CS(0.2) 0.854(0.106) 0.704(0.136) 0.828(0.052) 
Table 5 Simulation: identification of G-E interactions. In each cell, mean AUC (se). There are a 
total of 20 nonzero interactions, with coefficients ~uniform (0.7, 1.3) 


E Cor Robust Nonrobust Quantile 

Continuous 

0 AR(0) 0.866(0.088) 0.784(0.137) 0.761(0.047) 
AR(0.3) 0.963(0.075) 0.873(0.143) 0.776(0.062) 
AR(0.8) 0.976(0.061) 0.888(0.098) 0.862(0.066) 
BAND(0.3) 0.964(0.085) 0.861(0.159) 0.803(0.058) 
BAND(0.6) 0.973(0.066) 0.895(0.117) 0.659(0.043) 
CS(0.2) 0.904(0.092) 0.775(0.134) 0.848(0.039) 

0.1 AR(0) 0.792(0.090) 0.684(0.124) 0.734(0.053) 
AR(0.3) 0.938(0.082) 0.811(0.158) 0.786(0.048) 
AR(0.8) 0.962(0.075) 0.801(0.128) 0.864(0.039) 
BAND(0.3) 0.932(0.112) 0.806(0.165) 0.775(0.041) 
BAND(0.6) 0.948(0.080) 0.826(0.150) 0.633(0.052) 
CS(0.2) 0.881(0.105) 0.740(0.130) 0.829(0.058) 

0.3 AR(0) 0.733(0.094) 0.612(0.095) 0.753(0.046) 
AR(0.3) 0.862(0.122) 0.701(0.152) 0.749(0.052) 
AR(0.8) 0.927(0.089) 0.710(0.128) 0.861(0.057) 
BAND(0.3) 0.879(0.110) 0.725(0.146) 0.748(0.033) 
BAND(0.6) 0.907(0.089) 0.744(0.139) 0.598(0.062) 
CS(0.2) 0.820(0.105) 0.637(0.114) 0.841(0.045) 

Categorical 

0 AR(0) 0.866(0.086) 0.782(0.130) 0.733(0.064) 
AR(0.3) 0.955(0.079) 0.869(0.140) 0.728(0.051) 
AR(0.8) 0.971(0.061) 0.881(0.100) 0.802(0.039) 
BAND(0.3) 0.967(0.061) 0.898(0.119) 0.749(0.048) 
BAND(0.6) 0.969(0.058) 0.888(0.119) 0.609(0.051) 
CS(0.2) 0.900(0.109) 0.763(0.134) 0.801(0.039) 

0.1 AR(0) 0.801(0.102) 0.702(0.125) 0.667(0.057) 
AR(0.3) 0.932(0.077) 0.806(0.159) 0.702(0.048) 
AR(0.8) 0.964(0.056) 0.830(0.123) 0.764(0.055) 
BAND(0.3) 0.942(0.074) 0.814(0.151) 0.689(0.053) 
BAND(0.6) 0.959(0.064) 0.826(0.148) 0.604(0.045) 
CS(0.2) 0.875(0.104) 0.713(0.123) 0.751(0.048) 

0.3 AR(0) 0.734(0.099) 0.601(0.114) 0.681(0.039) 
AR(0.3) 0.873(0.104) 0.681(0.131) 0.687(0.058) 
AR(0.8) 0.931(0.075) 0.746(0.120) 0.768(0.048) 


BAND(0.3) 0.877(0.132) 0.704(0.157) 0.699(0.059) 
BAND(0.6) 0.905(0.101) 0.718(0.146) 0.547(0.067) 
CS(0.2) 0.800(0.124) 0.635(0.115) 0.724(0.055) 
Table 6 Simulation: identification of main G effects. In each cell, mean AUC (se). There are a 
total of 20 nonzero main effects, with coefficients ~uniform (0.4, 0.6) 


E Cor Robust Nonrobust Quantile 

Continuous 

0 AR(0) 0.735(0.065) 0.738(0.057) 0.684(0.042) 
AR(0.3) 0.885(0.049) 0.894(0.046) 0.798(0.038) 
AR(0.8) 0.961(0.045) 0.925(0.036) 0.811(0.048) 
BAND(0.3) 0.896(0.046) 0.894(0.047) 0.809(0.044) 
BAND(0.6) 0.916(0.050) 0.921(0.047) 0.792(0.039) 
CS(0.2) 0.769(0.065) 0.710(0.059) 0.753(0.049) 

0.1 AR(0) 0.701(0.064) 0.682(0.071) 0.678(0.033) 
AR(0.3) 0.806(0.079) 0.754(0.112) 0.794(0.049) 
AR(0.8) 0.947(0.046) 0.871(0.088) 0.806(0.053) 
BAND(0.3) 0.865(0.071) 0.809(0.132) 0.801(0.036) 
BAND(0.6) 0.886(0.064) 0.860(0.086) 0.789(0.055) 
CS(0.2) 0.738(0.080) 0.659(0.084) 0.784(0.042) 

0.3 AR(0) 0.646(0.073) 0.577(0.067) 0.632(0.044) 
AR(0.3) 0.774(0.090) 0.664(0.088) 0.672(0.052) 
AR(0.8) 0.896(0.081) 0.743(0.127) 0.755(0.041) 
BAND(0.3) 0.782(0.106) 0.664(0.112) 0.711(0.065) 
BAND(0.6) 0.823(0.104) 0.719(0.141) 0.705(0.053) 
CS(0.2) 0.708(0.078) 0.601(0.093) 0.645(0.051) 

Categorical 

0 AR(0) 0.736(0.068) 0.729(0.054) 0.679(0.045) 
AR(0.3) 0.858(0.066) 0.901(0.059) 0.782(0.052) 
AR(0.8) 0.944(0.055) 0.922(0.046) 0.797(0.041) 
BAND(0.3) 0.869(0.062) 0.900(0.057) 0.787(0.058) 
BAND(0.6) 0.894(0.062) 0.920(0.045) 0.762(0.048) 
CS(0.2) 0.768(0.067) 0.696(0.055) 0.763(0.048) 

0.1 AR(0) 0.716(0.074) 0.659(0.096) 0.669(0.051) 
AR(0.3) 0.826(0.077) 0.777(0.132) 0.752(0.039) 
AR(0.8) 0.914(0.089) 0.837(0.103) 0.786(0.054) 
BAND(0.3) 0.838(0.079) 0.805(0.120) 0.743(0.062) 
BAND(0.6) 0.867(0.069) 0.828(0.118) 0.721(0.039) 
CS(0.2) 0.723(0.070) 0.642(0.080) 0.678(0.042) 

0.3 AR(0) 0.639(0.074) 0.588(0.074) 0.613(0.045) 
AR(0.3) 0.758(0.083) 0.684(0.097) 0.658(0.047) 
AR(0.8) 0.877(0.109) 0.764(0.138) 0.743(0.055) 


BAND(0.3) 0.789(0.089) 0.703(0.115) 0.688(0.044) 
BAND(0.6) 0.806(0.104) 0.702(0.121) 0.671(0.049) 
CS(0.2) 0.694(0.074) 0.599(0.070) 0.648(0.037) 
Table 7 Simulation: identification of G-E interactions. In each cell, mean AUC (se). There are a 
total of 40 nonzero interactions, with coefficients ~uniform (0.4, 0.6) 


E Cor Robust Nonrobust Quantile 
Continuous 
0 AR(0) 0.647(0.048) 0.605(0.053) 0.606(0.049) 
AR(0.3) 0.755(0.059) 0.707(0.079) 0.752(0.058) 
AR(0.8) 0.892(0.074) 0.781(0.073) 0.814(0.039) 
BAND(0.3) 0.791(0.074) 0.726(0.093) 0.769(0.052) 
BAND(0.6) 0.838(0.066) 0.765(0.084) 0.629(0.048) 
CS(0.2) 0.686(0.077) 0.609(0.055) 0.602(0.048) 
0.1 AR(0) 0.623(0.041) 0.593(0.049) 0.563(0.059) 
AR(0.3) 0.701(0.074) 0.625(0.085) 0.602(0.055) 
AR(0.8) 0.863(0.060) 0.734(0.088) 0.778(0.041) 
BAND(0.3) 0.750(0.070) 0.661(0.099) 0.711(0.058) 
BAND(0.6) 0.795(0.069) 0.722(0.089) 0.725(0.061) 
CS(0.2) 0.662(0.059) 0.586(0.052) 0.596(0.062) 
0.3 AR(0) 0.581(0.050) 0.537(0.037) 0.503(0.055) 
AR(0.3) 0.656(0.064) 0.570(0.051) 0.596(0.049) 
AR(0.8) 0.807(0.086) 0.647(0.085) 0.679(0.058) 
BAND(0.3) 0.677(0.077) 0.580(0.072) 0.618(0.061) 
BAND(0.6) 0.718(0.082) 0.612(0.088) 0.604(0.042) 
CS(0.2) 0.642(0.060) 0.550(0.055) 0.571(0.046) 
Categorical 
0 AR(0) 0.640(0.051) 0.604(0.052) 0.611(0.052) 
AR(0.3) 0.743(0.067) 0.706(0.089) 0.733(0.041) 
AR(0.8) 0.887(0.070) 0.779(0.070) 0.806(0.034) 
BAND(0.3) 0.761(0.069) 0.716(0.089) 0.751(0.039) 
BAND(0.6) 0.820(0.076) 0.781(0.087) 0.623(0.059) 
CS(0.2) 0.688(0.071) 0.598(0.051) 0.601(0.033) 
0.1 AR(0) 0.619(0.051) 0.565(0.050) 0.548(0.047) 
AR(0.3) 0.706(0.068) 0.637(0.091) 0.647(0.054) 
AR(0.8) 0.842(0.092) 0.728(0.089) 0.751(0.061) 
BAND(0.3) 0.735(0.077) 0.667(0.092) 0.726(0.041) 
BAND(0.6) 0.771(0.078) 0.691(0.102) 0.736(0.034) 
CS(0.2) 0.658(0.061) 0.568(0.046) 0.589(0.052) 
0.3 AR(0) 0.597(0.053) 0.540(0.043) 0.540(0.062) 
AR(0.3) 0.663(0.070) 0.579(0.066) 0.604(0.038) 
AR(0.8) 0.794(0.091) 0.658(0.094) 0.704(0.058) 
BAND(0.3) 0.682(0.073) 0.606(0.080) 0.626(0.046) 
BAND(0.6) 0.717(0.086) 0.615(0.084) 0.614(0.047) 
CS(0.2) 0.646(0.060) 0.558(0.048) 0.604(0.055) 
Table 8 Analysis of the TCGA lung adenocarcinoma data using the nonrobust method. The identified 
interactions are denoted as "gene * environmental variable". For the interactions, values in 
"()" are the stability results 

Effect                            Estimate × 100 
Age                               0.868 
Gender                            6.683 
Smoking pack years                0.041 
Smoking history                   -20.163 
SPATA33                           -1.060 
DNAJC21                           5.237 
EIF4EBP1                          8.736 
FAM160B1                          -0.030 
KIAA1586                          4.018 
LRRC37A4P                         3.040 
ST6GALNAC1                        5.989 
TM2D2                             10.110 
TMEM192                           -5.785 
TROVE2                            -3.019 
WIPI2                             5.296 
SPATA33 * smoking pack years      -0.084(0.812) 
DNAJC21 * smoking history         5.245(0.986) 
EIF4EBP1 * smoking pack years     -0.087(0.977) 
FAM160B1 * gender                 11.844(0.982) 
KIAA1586 * age                    0.107(0.954) 
LRRC37A4P * smoking pack years    -0.205(0.998) 
LRRC37A4P * smoking history       -6.799(0.998) 
ST6GALNAC1 * smoking pack years   -0.149(0.989) 
TM2D2 * smoking pack years        -0.176(0.998) 
TMEM192 * gender                  9.853(0.995) 
TROVE2 * gender                   7.349(0.929) 
WIPI2 * smoking history           13.420(0.995) 


5 Discussions 


To understand the prognosis of complex diseases, it is essential to study G-E interactions. 
In "classic" low-dimensional biomedical studies, data contamination is not rare, 
and it has been suggested that robust methods are needed to accommodate 
contamination. This study has developed a robust method for high-dimensional 
genetic interaction analysis, an area that is still limited in the literature. The proposed 
method consists of a robust loss function and a penalized identification strategy 
that respects the "main effects, interactions" hierarchy, both of which have novel 
Table 9 Analysis of the TCGA lung adenocarcinoma data using the quantile method. The identified 
interactions are denoted as "gene * environmental variable". For the interactions, values in "()" are 
the stability results 

Effect                Estimate × 100 
Age                   0.891 
ATP6V1C1              0.252 
C1ORF27               7.321 
SDE2                  0.337 
CD46                  0.584 
DNAJC21               1.272 
KLHL7                 0.932 
PTK2                  9.426 
PVT1                  1.148 
RAB3GAP2              0.845 
TSPAN3                8.557 
TWISTNB               0.872 
WDR26                 1.265 
WIPI2                 7.227 
YWHAZ                 1.883 
ATP6V1C1 * age        0.0172(0.724) 
C1ORF27 * age         0.295(0.899) 
SDE2 * age            0.0153(0.758) 
CD46 * age            0.0344(0.862) 
DNAJC21 * age         0.327(0.791) 
KLHL7 * age           0.372(0.514) 
PTK2 * age            1.074(0.927) 
PVT1 * age            0.876(0.711) 
RAB3GAP2 * age        0.923(0.757) 
TSPAN3 * age          1.388(0.942) 
TWISTNB * age         0.915(0.812) 
WDR26 * age           1.279(0.798) 
WIPI2 * age           1.891(0.906) 
YWHAZ * age           1.596(0.796) 


advancements. Also significantly advancing from the literature, we have rigorously 
established the consistency properties. The theoretical results may seem "familiar", 
which is "comforting" in that the consistency properties are not sacrificed with the 
additional robustness, high dimensionality, and interactions. It is worth noting that 
the consistency results do not demand excessive assumptions on the error distribution, 
which are usually needed in the existing literature. In simulation, the proposed 
method outperforms the nonrobust alternative. It is interesting to note that it can have 
superior performance even when there is no contamination. Another important finding 
is that it also outperforms the quantile-based robust method. Most of the existing 
high-dimensional robust studies have adopted the quantile regression technique. Our 
simulation suggests that it is prudent to develop alternative robust methods. In the 
analysis of TCGA lung cancer data, the proposed method generates results that have some 
overlap with those of the quantile regression method but none with those of the nonrobust 
method. The identified genes have important implications, and the identified 
interactions are stable. 

The proposed study can potentially be extended in multiple directions. In survival 
analysis, there are many other models beyond the AFT. It can be of interest to develop 
robust methods based on other models. We have studied G-E interactions. It can be 
of interest to extend to G-G interactions. In theoretical analysis, one problem left is 
the breakdown point. Because of the extremely high complexity, this problem has 
been left uninvestigated in many other robust studies too. In our simulation, we have 
experimented with contamination rates as high as 30%, which is much higher than in 
many of the existing studies. The superiority of the proposed method over the quantile 
regression method is observed. The relative efficiency of different robust methods, 
although of interest, will be postponed to future studies. In data analysis, the proposed 
method identifies a different set of main effects and interactions. Mining the literature 
and the stability evaluation can support the validity of the findings to a certain extent. 
More validations need to be pursued in the future. 


Appendix 


Proof of Theorem 1 


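Proof Define the oracle estimator $\hat{\zeta}$ with $\hat{\zeta}_{\mathcal{A}^c} = 0$ and 

$$\hat{\zeta}_{\mathcal{A}} = \arg\max_{\zeta_{\mathcal{A}}} \sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_{i,\mathcal{A}}^{\top}\zeta_{\mathcal{A}})^2/\theta\right). \tag{4}$$

Recall that the proposed objective function is 

$$L_{\lambda_1,\lambda_2,\theta}(\zeta) = Q_n(\zeta) - \sum_{k=1}^{p}\rho(\|b_k\|_2;\lambda_1,s) - \sum_{k=1}^{p}\sum_{j=2}^{q+1}\rho(|b_{kj}|;\lambda_2,s). \tag{5}$$

In what follows, we first establish the estimation consistency of $\hat{\zeta}_{\mathcal{A}}$ in Step 1, and then 
show that $\hat{\zeta}$ is a local maximizer of $L_{\lambda_1,\lambda_2,\theta}(\zeta)$ in Step 2. 
Step 1. Define the objective function 

$$R_n(\zeta_{\mathcal{A}}) = \sum_{i=1}^{n} \omega_i \exp\left(-(y_i - u_{i,\mathcal{A}}^{\top}\zeta_{\mathcal{A}})^2/\theta\right).$$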
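Then $\hat{\zeta}_{\mathcal{A}} = \arg\max R_n(\zeta_{\mathcal{A}})$. Let $r_n = \sqrt{|\mathcal{A}|/n}$. To prove $\|\hat{\zeta}_{\mathcal{A}} - \zeta^*_{\mathcal{A}}\|_2 = O_p(r_n)$, 
it suffices to show that for any given $\eta > 0$, there exists a sufficiently large constant 
$C > 0$ such that 

$$\Pr\left(\sup_{\zeta_{\mathcal{A}} \in \mathcal{N}} R_n(\zeta_{\mathcal{A}}) < R_n(\zeta^*_{\mathcal{A}})\right) \ge 1 - \eta, \tag{6}$$

where $\mathcal{N} = \{\zeta_{\mathcal{A}} : \|\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}\|_2 = C r_n\}$. This implies that $R_n(\zeta_{\mathcal{A}})$ has a local 
maximizer $\hat{\zeta}_{\mathcal{A}}$ that satisfies $\|\hat{\zeta}_{\mathcal{A}} - \zeta^*_{\mathcal{A}}\|_2 = O_p(r_n)$. 
Recall the definitions of $D_n(\zeta)$ and $I_n(\zeta)$. By Taylor's expansion, we have 

$$\begin{aligned}
R_n(\zeta_{\mathcal{A}}) - R_n(\zeta^*_{\mathcal{A}}) &= \sum_{i=1}^{n} \omega_i \left[\exp\left(-(y_i - u_{i,\mathcal{A}}^{\top}\zeta_{\mathcal{A}})^2/\theta\right) - \exp\left(-(y_i - u_{i,\mathcal{A}}^{\top}\zeta^*_{\mathcal{A}})^2/\theta\right)\right] \\
&= D_{n,\mathcal{A}}(\zeta^*)^{\top}(\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}) + \frac{1}{2}(\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}})^{\top} I_{n,\mathcal{A}\mathcal{A}}(\tilde{\zeta})(\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}) \\
&= Q_1 + Q_2,
\end{aligned} \tag{7}$$

where $\tilde{\zeta}$ lies between $\zeta^*$ and $\zeta$. By C3 and C4, we have that for all $j \in \{1, \ldots, p + q + pq + 1\}$ 
and any given $t$, $\Pr(|D_{n,j}(\zeta^*)| > t) \le 2\exp(-nt^2/\sigma^2)$. Then 
$E(|\sqrt{n}D_{n,j}(\zeta^*)|^2) < K < \infty$ for all $j$. With Markov's inequality, 

$$\Pr(\|D_{n,\mathcal{A}}(\zeta^*)\|_2 > t) \le E(\|\sqrt{n}D_{n,\mathcal{A}}(\zeta^*)\|_2^2)/(nt^2) \le |\mathcal{A}|K/(nt^2).$$

By the Cauchy-Schwarz inequality, $Q_1 \le C\|D_{n,\mathcal{A}}(\zeta^*)\|_2 r_n$. Let $t = C\rho_* r_n/3$, where 
$\rho_*$ is the smallest eigenvalue of $-I_{\mathcal{A}\mathcal{A}}(\zeta^*)$. From C5, we have that $\rho_*$ is bounded away 
from zero and infinity. Then we have 

$$\Pr\left(Q_1 \le \frac{1}{3}\rho_* C^2 r_n^2\right) \ge 1 - \frac{9K}{C^2\rho_*^2}. \tag{8}$$

For $Q_2$, we have 

$$\begin{aligned}
2Q_2 &= (\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}})^{\top} I_{\mathcal{A}\mathcal{A}}(\zeta^*)(\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}) \\
&\quad + (\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}})^{\top}\{I_{n,\mathcal{A}\mathcal{A}}(\tilde{\zeta}) - I_{n,\mathcal{A}\mathcal{A}}(\zeta^*)\}(\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}) \\
&\quad + (\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}})^{\top}\{I_{n,\mathcal{A}\mathcal{A}}(\zeta^*) - I_{\mathcal{A}\mathcal{A}}(\zeta^*)\}(\zeta_{\mathcal{A}} - \zeta^*_{\mathcal{A}}) \\
&= Q_{21} + Q_{22} + Q_{23}.
\end{aligned} \tag{9}$$

Since $\lambda_{\max}(I_{\mathcal{A}\mathcal{A}}(\zeta^*)) \le -\rho_*$ by C5, we have 

$$Q_{21} \le -\rho_* C^2 r_n^2. \tag{10}$$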

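Under C4, we have 

$$Q_{22} \le \kappa\|\tilde{\zeta} - \zeta^*\|_2 C^2 r_n^2 \le \kappa C^3 r_n^3 \le \frac{1}{9}\rho_* C^2 r_n^2. \tag{11}$$

The second inequality holds since $\tilde{\zeta}$ lies between $\zeta^*$ and $\zeta$, which yields $\|\tilde{\zeta} - \zeta^*\|_2 \le C r_n$. 
When $n$ is sufficiently large, the last inequality holds. With C4 and Bonferroni's 
inequality, 

$$\Pr\left(\|I_{n,\mathcal{A}\mathcal{A}}(\zeta^*) - I_{\mathcal{A}\mathcal{A}}(\zeta^*)\|_F \ge \rho_*/9\right) \le 2|\mathcal{A}|^2\exp(-n\rho_*^2/\sigma^2),$$

where $\|\cdot\|_F$ denotes the Frobenius norm. By the inequality $\lambda_{\max}(I_{n,\mathcal{A}\mathcal{A}}(\zeta^*) - I_{\mathcal{A}\mathcal{A}}(\zeta^*)) \le \|I_{n,\mathcal{A}\mathcal{A}}(\zeta^*) - I_{\mathcal{A}\mathcal{A}}(\zeta^*)\|_F$, we have 

$$Q_{23} \le \frac{1}{9}\rho_* C^2 r_n^2 \quad \text{with probability at least } 1 - 2|\mathcal{A}|^2\exp(-n\rho_*^2/\sigma^2). \tag{12}$$

Combining (9), (10), (11), and (12), we have 

$$\Pr\left(Q_2 \le -\frac{7}{18}\rho_* C^2 r_n^2\right) \ge 1 - 2|\mathcal{A}|^2\exp(-n\rho_*^2/\sigma^2). \tag{13}$$

With (7), (8), and (13), we have 

$$R_n(\zeta_{\mathcal{A}}) - R_n(\zeta^*_{\mathcal{A}}) \le -\frac{1}{18}\rho_* C^2 r_n^2 < 0 \tag{14}$$

with probability at least 

$$1 - \frac{9K}{C^2\rho_*^2} - 2|\mathcal{A}|^2\exp(-n\rho_*^2/\sigma^2).$$

Note that $\rho_*$ is bounded away from zero and infinity in C5. Let $C = 4\rho_*^{-1}\sqrt{K/\eta}$. Then, as $n \to \infty$, 
the above probability is bigger than $1 - \eta$, and we can conclude (6). 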


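Step 2. Next we show that the oracle estimator $\hat{\zeta}$ studied in Step 1 satisfies the 
Karush-Kuhn-Tucker (KKT) condition, and then $\hat{\zeta}$ is a local maximizer of $L_{\lambda_1,\lambda_2,\theta}(\zeta)$. Based 
on the results in Step 1 and C6, we only need to check that the following conditions 

$$\|D_{n,\mathcal{B}_1}(\hat{\zeta})\|_\infty < \lambda_1, \quad \|D_{n,\mathcal{B}_2}(\hat{\zeta})\|_\infty < \lambda_2, \quad \|D_{n,\mathcal{B}_3}(\hat{\zeta})\|_\infty < \lambda_1 + \lambda_2 \tag{15}$$

hold with asymptotic probability one, where $\|v\|_\infty = \max_i |v_i|$ for any vector $v = (v_1, \cdots, v_m)$. 
Applying Taylor's expansion, 

$$D_{n,\mathcal{B}_1}(\hat{\zeta}) = D_{n,\mathcal{B}_1}(\zeta^*) + I_{n,\mathcal{B}_1\mathcal{A}}(\bar{\zeta})(\hat{\zeta}_{\mathcal{A}} - \zeta^*_{\mathcal{A}}), \tag{16}$$

where $\bar{\zeta}$ lies between $\zeta^*$ and $\hat{\zeta}$. From (4) and the proof of Theorem 1(a), we have 

$$\hat{\zeta}_{\mathcal{A}} - \zeta^*_{\mathcal{A}} = -I_{n,\mathcal{A}\mathcal{A}}(\tilde{\zeta})^{-1} D_{n,\mathcal{A}}(\zeta^*), \tag{17}$$

where $\tilde{\zeta}$ lies between $\zeta^*$ and $\hat{\zeta}$, which is defined in Step 1. By substituting (17) 
into (16), 

$$D_{n,\mathcal{B}_1}(\hat{\zeta}) = D_{n,\mathcal{B}_1}(\zeta^*) - I_{n,\mathcal{B}_1\mathcal{A}}(\bar{\zeta}) I_{n,\mathcal{A}\mathcal{A}}(\tilde{\zeta})^{-1} D_{n,\mathcal{A}}(\zeta^*). \tag{18}$$

Here we define 

$$\Delta_{n,\mathcal{B}_1} = D_{n,\mathcal{B}_1}(\zeta^*) - I_{\mathcal{B}_1\mathcal{A}}(\zeta^*) I_{\mathcal{A}\mathcal{A}}(\zeta^*)^{-1} D_{n,\mathcal{A}}(\zeta^*).$$

Inspired by the deduction of $Q_2$ in Step 1, we can establish that 

$$\Pr(\|D_{n,\mathcal{B}_1}(\hat{\zeta})\|_\infty > \lambda_1) \le \Pr(\|\Delta_{n,\mathcal{B}_1}\|_\infty > \lambda_1).$$

That is, we only need to focus on $\|\Delta_{n,\mathcal{B}_1}\|_\infty$ in order to evaluate the probability of 
$(\|D_{n,\mathcal{B}_1}(\hat{\zeta})\|_\infty < \lambda_1)$ in (15). Note that 

$$\begin{aligned}
\|\Delta_{n,\mathcal{B}_1}\|_\infty &\le \|D_{n,\mathcal{B}_1}(\zeta^*)\|_\infty + \|I_{\mathcal{B}_1\mathcal{A}}(\zeta^*) I_{\mathcal{A}\mathcal{A}}(\zeta^*)^{-1} D_{n,\mathcal{A}}(\zeta^*)\|_\infty \\
&\le \|D_{n,\mathcal{B}_1}(\zeta^*)\|_\infty + \|I_{\mathcal{B}_1\mathcal{A}}(\zeta^*) I_{\mathcal{A}\mathcal{A}}(\zeta^*)^{-1}\|_\infty \|D_{n,\mathcal{A}}(\zeta^*)\|_\infty.
\end{aligned} \tag{19}$$

Recall that $\Phi_1 = \|I_{\mathcal{B}_1\mathcal{A}}(\zeta^*) I_{\mathcal{A}\mathcal{A}}(\zeta^*)^{-1}\|_\infty$. If 

$$\|D_n(\zeta^*)\|_\infty < \frac{\lambda_1}{1 + \Phi_1},$$

along with (19), we have $\|\Delta_{n,\mathcal{B}_1}\|_\infty < \lambda_1$. Similarly, we also need 

$$\|D_n(\zeta^*)\|_\infty < \frac{\lambda_2}{1 + \Phi_2} \quad \text{and} \quad \|D_n(\zeta^*)\|_\infty < \frac{\lambda_1 + \lambda_2}{1 + \Phi_3}$$

to satisfy the other two conditions in (15), where $\Phi_2 = \|I_{\mathcal{B}_2\mathcal{A}}(\zeta^*) I_{\mathcal{A}\mathcal{A}}(\zeta^*)^{-1}\|_\infty$ 
and $\Phi_3 = \|I_{\mathcal{B}_3\mathcal{A}}(\zeta^*) I_{\mathcal{A}\mathcal{A}}(\zeta^*)^{-1}\|_\infty$. Based on the above discussions, we have 

$$\|D_n(\zeta^*)\|_\infty < \frac{\lambda_1 \wedge \lambda_2}{1 + \max_{l=1}^{3}\Phi_l} \le \min\left\{\frac{\lambda_1}{1 + \Phi_1}, \frac{\lambda_2}{1 + \Phi_2}, \frac{\lambda_1 + \lambda_2}{1 + \Phi_3}\right\}.$$

We now derive the probability bound for the above event. By Bonferroni's inequality 
and C4, we can obtain 

$$\Pr\left\{\|D_n(\zeta^*)\|_\infty < \frac{\lambda_1 \wedge \lambda_2}{1 + \max_{l=1}^{3}\Phi_l}\right\} \ge 1 - 2(pq + p + q + 1)\exp\left\{-\frac{n(\lambda_1 \wedge \lambda_2)^2}{(1 + \max_{l=1}^{3}\Phi_l)^2\sigma^2}\right\}.$$

Combining the results in Steps 1 and 2, we conclude that $\hat{\zeta}$ is a local maximizer 
of $L_{\lambda_1,\lambda_2,\theta}(\zeta)$ with probability at least 

$$1 - \eta - 2(pq + p + q + 1)\exp\left\{-\frac{n(\lambda_1 \wedge \lambda_2)^2}{(1 + \max_{l=1}^{3}\Phi_l)^2\sigma^2}\right\}$$

and satisfies $\|\hat{\zeta}_{\mathcal{A}} - \zeta^*_{\mathcal{A}}\|_2 = O_p(\sqrt{|\mathcal{A}|/n})$, $\hat{\zeta}_{\mathcal{A}^c} = 0$. With C6, $\log p = O(n w_n^2)$, 
and $w_n = (\lambda_1 \wedge \lambda_2)/\{\max(\Phi_1, \Phi_2, \Phi_3)\}$, this tail probability is exponentially small. 
The theorem is thus proved. 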
A Novel Approach for Improving 
Accuracy for Distributed Storage 
Networks 


Liu Lu, Ke Yuanyuan, and Yuan Yong 


Abstract With the development of storage technology and Internet technology, 
cloud storage continues to make its impact. Scalability, reliability, and lowered costs 
have made cloud storage widely and successfully used by businesses and individuals. 
The advent of the blockchain has brought some changes. As the incentive layer for 
IPFS, Filecoin allows storage resources to become tradable, greatly extending storage 
capacity. However, the process of testing the integrity of data still needs constant 
improvement. In this chapter, we propose a new data audit proof, in which nodes 
continuously upload hashes of data combined with random numbers, and the 
smart contract compares the results to verify the integrity of the data. Meanwhile, 
the data owner can perform the calculation and then issue a challenge to verify data integrity. 
Audit miners are responsible for regulating the behavior of miners and protecting 
users' data, and they operate in a state of semi-participation. It is demonstrated later 
in the chapter that this proof is sufficiently accurate and resistant to attacks. 


Keywords Distributed storage networks · Cloud storage · Blockchain 


1 Introduction 


Storage technology has evolved rapidly over the last few decades, with continuously 
decreasing hard disk prices and ever faster data speeds. However, the rapid growth of 
the online economy and big data technology has caused the need for data storage to 
expand exponentially, leading to the idea of cloud storage, in which data are stored 
on cloud servers provided by third parties so that users can access them in a timely 
and more convenient manner. Typically, cloud storage providers use technologies 
such as distributed storage (Mattson et al., 1970) to significantly reduce storage costs. 
However, centralized storage makes cloud storage providers vulnerable to single 
points of failure and can create risks such as providers overstepping their privileges and 
leaking information. Efficient centralized storage provisioning will remain 
mainstream in the future, but there is also an emerging and urgent need to meet users' 
needs for information security. 

Traditional cloud storage providers, such as Amazon and Google, build cloud 
storage architectures with vast resources, using distributed storage technology to 
serve billions of users. Distributed storage means that data are spread across multiple 
storage servers and that these scattered storage resources form a virtual storage 
device, effectively storing data in various places across the provider. The benefits 
of distributed storage are increased system reliability, availability, and access 
efficiency, as well as improved scalability. But the disadvantages are also obvious. We 
store our data on Google Cloud Drive on the basis that we trust Google to protect our 
data from being tampered with or lost, and this trust can itself lead to other disadvantages. 
The central server is vulnerable to attacks from adversaries, and internal failures 
and malpractice can also lead to data loss. As such, the security of cloud storage 
has also been a focus of attention in recent years. Traditional symmetric encryption 
algorithms put the keys on a central server, which makes it easier for attackers to 
get these keys and thus reduces the security of information. Moreover, data integrity 
verification, i.e., checking whether data are stored completely and without deletion, is also a crucial 
part of cloud storage services. Literature (Priyadharshini, 2012) summarizes the data 
integrity verification of traditional cloud storage, which is performed by a TPA (Third 
Party Auditor) between the user and the CSP (Cloud Service Provider) to validate 
the data. The user poses a challenge to verify the integrity of their cloud data, and 
the TPA responds by comparing the original data, or its hash value according 
to literature (Zikratov et al., 2017), to verify the integrity. However, inefficiencies 
and malicious behavior by any of the three parties, alone or in collusion, can make opaque audit proofs unreliable. The 
convenience of centralized services brings with it the corresponding pitfalls. With 
the emergence of Bitcoin, decentralized technology continues to be improved, and 
decentralized storage brings an important addition to the traditional storage market. 
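
The hash comparison performed by a TPA can be sketched as follows. This is a minimal illustration, not the scheme of any cited work: the `Auditor` class, its method names, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 digest of the data, as a hex string (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class Auditor:
    """Hypothetical third-party auditor that keeps only the digest
    recorded when the user uploaded the file."""

    def __init__(self):
        self._recorded = {}

    def register(self, file_id: str, data: bytes) -> None:
        # Called once at upload time; only the digest is retained.
        self._recorded[file_id] = digest(data)

    def verify(self, file_id: str, data_from_provider: bytes) -> bool:
        # A challenge succeeds only if the provider's current copy
        # hashes to the digest recorded at upload time.
        return self._recorded.get(file_id) == digest(data_from_provider)
```

Because the auditor alone decides the outcome, the user must trust it, which is the opacity problem noted above.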

The idea of providing decentralized storage has become popular with the rise 
of blockchain technology, and the two fit together naturally. 
Blockchain enables decentralized, untrusted nodes to reach consensus. 
Its development has facilitated intensive research in several technologies such as 
cryptography, data structures, and consensus algorithms. When data are stored in 
multiple copies on the hard drives of different nodes, we cannot guarantee that all 
nodes are trustworthy. How to ensure the security and integrity of the data is therefore a 
crucial issue. After ensuring the stability of the storage, we also need to consider how 
to motivate people to become nodes and provide their own storage capacity, which 
requires a reasonable incentive mechanism. 

Much of the current research is focused on issues such as access control, 
integrity verification, data retrieval, and traceability. Many platforms that 
offer distributed storage have already been launched. For example, the Sia 
(Vorick & Champine, 2023) storage system, which went online earlier, has 
not developed effectively because of its less than optimal incentive design. IPFS 
(Benet, 2014), as a relatively complete platform, is a distributed storage system 
protocol for distributing and storing resources of various data types. Filecoin (Protocol 
Labs, 2023), as its incentive layer, incentivizes storage miners and retrieval miners to 
complete their own work by issuing tokens. Taking Filecoin as an example, there are 
three roles: client, storage miner, and retrieval miner. Clients pay for the 
service of storing and retrieving data. They can choose from a selection of available 
service providers. If they want to store private data, they need to encrypt it before 
submitting it to the service provider. Storage miners store clients' data for a reward. 
They decide for themselves how much space to provide for storage. After the client 
and the storage miner have reached an agreement, the miner is obliged to provide 
proof of the stored data on an ongoing basis. Everyone can view this proof and 
make sure that the storage miner is reliable. Retrieval miners deliver data to clients 
upon request. They can retrieve data from clients or storage miners. Retrieval 
miners and clients use small payments to exchange data and tokens. The data are 
fragmented and the client pays a small amount of tokens for each fragment. Retrieval 
miners can also act as storage miners at the same time. 

We will now show how a decentralized storage network stores and audits data. As 
shown in Fig. 1, we illustrate a cloud service with blockchain participation in two 
aspects: storage and audit. Data owners upload their data to miners on the server, 
who store the data and record the transactions on the blockchain. The blockchain also 
verifies the data owner's information and protects the user's privacy. To ensure 
that their data are stored intact on the server, the data owner issues a challenge to the TPA, which 
sends a request to the system and verifies the data provided by the miner in response 
to the data owner's challenge. The verified result is then recorded on the blockchain. 
Each block in the blockchain therefore stores information such as the height of the block, the 
block header, information about the previous block, a timestamp, a storage message, 
an ID, and an auditing message. 
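
The block contents listed above can be pictured with a small record type. This is only a sketch: the field names, the use of a SHA-256 hash over a JSON serialization as the header hash, and the `chain_is_consistent` helper are assumptions for illustration, not the layout used by Filecoin or by this chapter.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Block:
    height: int             # position of the block in the chain
    prev_hash: str          # information about the previous block
    timestamp: float
    storage_message: str    # record of a storage transaction
    owner_id: str           # ID associated with the record
    auditing_message: str   # recorded audit result

    def header_hash(self) -> str:
        """Hash over all fields, standing in for the block header (assumption)."""
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


def chain_is_consistent(chain) -> bool:
    """Each block must reference the hash of its predecessor and
    have a height one greater than it."""
    return all(
        cur.prev_hash == prev.header_hash() and cur.height == prev.height + 1
        for prev, cur in zip(chain, chain[1:])
    )
```

Because each block commits to the hash of the previous one, altering a recorded storage or auditing message changes that block's hash and breaks the link from its successor.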

Data integrity verification in distributed storage, i.e., the red lines in Fig. 1, is our 
concern. Data integrity verification checks that data are stored intact in the 
storage space of each untrusted node. It affects the security of the data and is 
key to the availability of the storage service. Current data integrity validation can be 
divided into two kinds: traditional cloud-based data integrity validation and 
blockchain-based data integrity validation. In turn, audit solutions using 
blockchain technology can be divided according to whether or not a TPA is involved. Most of 
these audit schemes verify against raw data and avoid dishonest behaviors such as 
delayed audits, sybil attacks, and generation attacks through consensus and incentive 
mechanisms. Blockchain-based data integrity verification can be used not only for 
auditing cloud data in cloud storage networks but also in different data scenarios 
to improve the security of the system. However, efficiency and accuracy cannot be 
achieved together in the process of decentralized data auditing. This will be described 
in Sect. 3. Most of the verification schemes that have worked better in current research 
do not run on public blockchains or require the participation of trusted central nodes. 
In fully decentralized blockchains, by contrast, most schemes favor efficiency in order to ensure 
availability. However, the accuracy of verification cannot be fully guaranteed and 
the system is vulnerable to dishonest attacks. Our algorithm aims to improve long-term 
efficiency and stability in a fully decentralized blockchain with guaranteed accuracy. 

Our work is based on a modification of Filecoin for verifying the integrity of 
data in distributed storage. The audit miner in this algorithm is semi-involved and 
determines whether the data are kept intact by comparing the hash values of the 
data shards. If the result does not meet the desired goal, the audit miner will first 
ensure the integrity of the data and then find the storage miner that created the problem, 
acting as a reasonable supervisor. In Sect. 2, we summarize past work 
on distributed audit algorithms and describe the characteristics of each platform. 
In Sect. 3, we present our audit algorithm and analyze its advantages and the 
problems it solves. Sect. 4 analyzes the fault tolerance of this audit proof. 


2 Related Works 


2.1 Audit Research 


Ensuring data integrity in cloud computing has always been an important issue,
and it is a precondition for the wide adoption of cloud computing. Traditional data
integrity verification can be divided into deterministic and probabilistic types.
When a TPA is entrusted with integrity verification, its possible dishonest behavior
is also an important issue for audit algorithms. Blockchain technology, with its
decentralized architecture, no longer relies excessively on the honesty of third
parties. The core question to be explored at this stage is how to reach an overall
consensus through reasonable consensus and incentive mechanisms, to the mutual
benefit of all parties.

Zikratov et al. (2017) propose a private blockchain called Zeppar, which
determines the integrity of data by comparing the hash values of files. Using
cryptographic techniques to verify data integrity by comparison against the original
data is a common and widely applicable method. Such an approach is also used by Wei
et al. (2020), where smart contracts monitor data changes based on the unique hash
value of each file, generated by a Merkle Hash Tree (MHT). Verifying data integrity
by constructing an MHT is relatively convenient (Bai et al., 2018; Li et al., 2020).
In Li et al. (2020), the data owner (DO) stores the verification tag of the data on
the blockchain and verifies the data integrity by constructing an MHT. After the
blockchain network receives a request from the DO, it calculates the MHT root of the
specified data; the CSP receives the DO's challenge and also calculates the
corresponding MHT root; and the DO verifies the integrity of the data by comparing
the two. Neither comparing file hash values nor constructing an MHT requires the
involvement of a TPA. Such an approach can verify very efficiently, but it
compromises on the degree of decentralization or is less fault-tolerant. It is
relatively suitable for distributed storage systems where efficiency is required.
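
The root comparison described above can be illustrated with a minimal Python sketch. It is not the construction of any of the cited schemes: SHA-256, the duplication of the last node on odd levels, and the names `merkle_root`, `root_on_chain`, and `root_from_csp` are all assumptions made for illustration.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Return the root of a binary Merkle hash tree built over the blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [sha256(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


original = [b"block-0", b"block-1", b"block-2", b"block-3"]
tampered = [b"block-0", b"block-1", b"block-X", b"block-3"]

root_on_chain = merkle_root(original)         # computed by the blockchain side
root_from_csp = merkle_root(original)         # honest CSP holds the same blocks
root_from_bad_csp = merkle_root(tampered)     # one block was modified

print(root_on_chain == root_from_csp)         # roots agree: data intact
print(root_on_chain == root_from_bad_csp)     # roots differ: data changed
```

Only the two roots need to be exchanged and compared, which is why this style of check is cheap for the verifier.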

To ensure the liveness of the data, some auditing schemes supply random numbers
so that the results of data validation cannot be falsified in advance. Pinheiro et
al. (2020) use the user's data information to generate random challenges and use a
smart contract to audit the challenge-response information sent by the CSP. The
audit scheme also assesses the trustworthiness of each CSP.


2.2 Distributed Storage Project 


Sia: A relatively early decentralized storage platform, Sia (Vorick and
Champine, 2023) enables storage contracts to be formed between peer-to-peer
nodes. The contracts are stored in the blockchain, making them publicly auditable.
Sia divides files into 30 parts, encrypts each part using the Threefish algorithm,
and distributes them to different nodes. Reed-Solomon erasure coding makes it
possible to fully recover a file from any 10 of the 30 parts. Using a Merkle
tree (Ralph, 1988), nodes are required to upload storage proofs (Maxwell, 2023)
within a certain time frame or be penalized.

Filecoin: Benet (2014) proposes a distributed peer-to-peer web protocol, IPFS
(InterPlanetary File System). Based on a content-addressing protocol, it makes
network transmission faster, content storage easier, and node protection safer.
Filecoin can be considered the incentive layer of the IPFS system, providing
decentralized cloud storage in the form of tokens distributed in a rational way.
Its audit algorithm, Proof-of-Replication (Protocol Labs, 2017), applies a
deferred encoding to the data to obtain a replica and then generates a
zero-knowledge proof to guarantee the correctness of the encoding process. Its other
consensus algorithm, Proof-of-Spacetime, requires miners to periodically generate
Merkle proofs for the replicas and submit them to the blockchain, compressed with
zero-knowledge proofs, in exchange for token rewards. Such an incentive encourages
miners to store data correctly and to prove data liveness in order to be rewarded.

Arweave: The Arweave cloud storage platform is similar to Filecoin, and it
features a service that provides permanent storage. It has designed a new consensus
algorithm, Proof of Access, which is based on the concept that new blocks require
random validation of previous blocks. This turns the original blockchain into a
network of blocks, where nodes no longer need to store exponentially growing
amounts of data, but only certain data, allowing the data to be distributed evenly
across the system to achieve distributed storage.

Storj: Storj (Storj Labs, 2018), built on Kademlia, is not a fully decentralized
cloud storage system; it is dedicated to storage durability and storage quality.
Satellite nodes act as fully trusted nodes in Storj for data management and data
integrity review. The data are sliced after encryption, and data integrity is
guaranteed by the Proof of Retrievability (Juels et al., 2007) consensus algorithm.
The satellite nodes are responsible for communication between the user and the
storage nodes, for storing metadata for the user, and for auditing and enforcing
Proof of Retrievability. The presence of the satellite nodes makes Storj resistant
to Byzantine attacks, but at the expense of the network's performance, resulting in
poor scalability.


Table 1 summarizes the differences among these four platforms. Their audit proofs
differ, which leads to differences in their other properties.


Table 1 Distributed storage networks comparison

Platform   Degree of               Storage     Consensus algorithm        Audit proof
           decentralization        location
Sia                                Off chain   Proof of work              Proof of storage
                                                                          (Maxwell, 2023)
Filecoin                           Off chain   Expected consensus         Proof of replication,
                                               (Protocol Labs, 2017)      proof of spacetime
Arweave    Fully                   On chain    Proof of work,             Proof of access
                                               proof of access
Storj      Satellite nodes exist   Off chain   Proof of work,             Proof of retrievability
                                               proof of stake             (Juels et al., 2007)


However, there are still some flaws. Current work mostly verifies the integrity
of distributed storage data under specific conditions, but none of it offers a
systematic analysis of the limitations of auditing. In the next section, we analyze
the compromise factors that can arise from audit algorithms in distributed storage.
We also analyze what requirements the Filecoin platform should have for auditing and
what constraints it should place on storage miners. We design an audit proof for
distributed storage and prove that it is sufficiently accurate and fault-tolerant.


3 Audit Algorithm 


3.1 An Audit Framework 


In this chapter, we reformulate the audit proof of Filecoin to address the current
problems of the Filecoin platform. Our goal is to retain the decentralized nature and
allow the distributed storage network to complete the audit process on its own. Audit
miners appear only when necessary. This ensures the accuracy of the audit
and improves the efficiency with which all nodes reach consensus on the audit
results. We propose the following audit impossibility proposition for distributed
storage networks:


Proposition 1 (Audit impossibility): The degree of decentralization, the accuracy
of audit results, and audit efficiency cannot all be achieved at the same time.


When integrity checks are performed on an absolutely centralized storage server,
the CSP can invest significant resources to increase the efficiency and accuracy of
the audit, as many cloud storage providers do nowadays. This is the approach
that currently dominates the cloud storage market. With decentralization, however,
we cannot perform fast and efficient integrity checks on untrustworthy storage nodes,
given today's computing power and the sheer volume of data. How to balance
accuracy and efficiency is currently the key issue for auditing in all distributed
storage. For Filecoin, decentralization is its biggest advantage. However, overly
frequent data auditing not only affects the accuracy of the audit results but also
makes the system less stable when nodes are offline. Therefore, improving the
efficiency of auditing while ensuring the accuracy of the audit results is the issue
considered in this chapter.

Our design starts by slicing and numbering the data owner's encrypted data using
the shard technique and then generates multiple copies (k copies) by replication,
which are stored randomly on storage miners. When auditing these files, we take
the last 16 bits of the hash of the previous block as the new random number 𝒩,
which all miners add to each of their stored shards before hashing. The results
must be uploaded to the hash pool in a fixed order, together with the miners'
signatures. All the hash values are automatically matched by the smart contract. If
the number of occurrences of each hash value is equal to k, it is concluded that the
data are stored intact in the distributed storage network. A simple comparison of
the results thus determines whether the data owner's data are completely stored
across the network. The data owner can also, but need not, add 𝒩 to his/her own
shard data and hash them. The result is then compared against the hash pool: if each
value appears k times in the network's results, the data are stored correctly on the
miners. If a storage miner has not stored the data validly, the audit miner needs to
find the problem miner quickly and back up the data in time (Fig. 2).
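
As a rough illustration of the challenge step, the sketch below derives the random number from the last 16 bits of the previous block hash and mixes it into a shard before hashing. The text does not fix the hash function or how the random number is combined with a shard, so SHA-256 and byte concatenation are assumptions, as are the names `audit_nonce` and `shard_digest`.

```python
import hashlib


def audit_nonce(prev_block_hash: bytes) -> bytes:
    """Last 16 bits (two bytes) of the previous block hash."""
    return prev_block_hash[-2:]


def shard_digest(shard: bytes, nonce: bytes) -> str:
    """Hash of a shard combined with the nonce (assumed: concatenation)."""
    return hashlib.sha256(shard + nonce).hexdigest()


prev_block_hash = hashlib.sha256(b"previous block header").digest()
nonce = audit_nonce(prev_block_hash)

shard = b"encrypted shard D(i, j)"
digest_miner_1 = shard_digest(shard, nonce)   # two honest miners holding
digest_miner_2 = shard_digest(shard, nonce)   # copies of the same shard

# A result precomputed with a different nonce no longer matches.
stale_nonce = bytes([nonce[0] ^ 0xFF, nonce[1]])
digest_stale = shard_digest(shard, stale_nonce)

print(digest_miner_1 == digest_miner_2)
print(digest_miner_1 == digest_stale)
```

Because the nonce changes with every block, a miner cannot reuse an old result and has to hold the shard at audit time.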

There are three roles in our platform: data owners U, storage miners M, and
audit miners A. Data owners upload encrypted data according to their needs and can
challenge the integrity of the data. Storage miners store the data sequentially as
assigned by the smart contract and periodically upload proofs of data integrity.
Audit miners are responsible for handling the distribution of data, as well as
reviewing and supervising miners, protecting data integrity, regulating content,
and assuming legal responsibility. The number of audit miners is limited, and a
storage miner can be an audit miner at the same time. Audit miners only appear if
there are problems with the audit.


3.2 Data Uploading 


From the moment of upload, the user U_i should divide his/her data D(i)
into several shards using slicing and encryption technology in order to keep the
data secure. If he/she does not have enough computing power to handle very large
data, he/she can upload them to an audit miner A for slicing and encryption and
pay some tokens. All the shards are then distributed by the audit miner to storage
miners M and returned to user U_i. Data slicing is a common technique used in
distributed storage to protect the data. We use D(i, j) to denote the jth shard of
U_i's data. The number of shards U_i has, J_i, is determined by the size of U_i's
data. Replication is also used to create k copies of D(i, j), denoted D(i, j, k).
Before uploading, U_i can add a random number N known only to him/her, compute the
result for every D(i, j), and encrypt the results as an option for a later
validation audit. Next, a randomly chosen miner M_a is sent a request to store the
corresponding shard, subject to specific requirements (Definition 1). An M_a that
receives the request chooses whether or not to store the data, depending on its
storage capacity. A miner who confirms storage stores the corresponding D(i, j) on
his/her local hard disk. The label (i, j) of the data D(i, j) is stored only in the
smart contract and is not transmitted to the miner who stores it. The miner does not
know the exact label (i, j) of the data he/she stores, but only numbers the shards
sequentially in the order in which he/she stores them. If D(i, j) is the sixth shard
stored by M_a, then the corresponding D(i, j) is M_a(6). We use M_a() to denote the
set of shards stored by M_a. This better protects the user's information and data
and prevents the exchange of content between miners as much as possible.
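
The bookkeeping described above can be sketched as follows. The class `PlacementRegistry` is a hypothetical stand-in for the smart contract, and `split_into_shards` and `upload` are illustrative names. Encryption is omitted, and miners are drawn uniformly at random without applying the principles of Definition 1.

```python
import random


def split_into_shards(data: bytes, shard_size: int) -> list[bytes]:
    """Cut (already encrypted) data into shards D(i, 0), D(i, 1), ..."""
    return [data[p:p + shard_size] for p in range(0, len(data), shard_size)]


class PlacementRegistry:
    """Stand-in for the smart contract: the only place labels (i, j) live."""

    def __init__(self) -> None:
        self.labels: dict[tuple[str, int], tuple[int, int]] = {}
        self.storage: dict[str, list[bytes]] = {}

    def store(self, miner: str, label: tuple[int, int], shard: bytes) -> int:
        shards = self.storage.setdefault(miner, [])
        shards.append(shard)               # the miner sees only the bytes
        position = len(shards)             # M_a(1), M_a(2), ... in storage order
        self.labels[(miner, position)] = label
        return position


def upload(owner: int, data: bytes, shard_size: int, k: int,
           miners: list[str], registry: PlacementRegistry,
           rng: random.Random) -> None:
    for j, shard in enumerate(split_into_shards(data, shard_size)):
        for miner in rng.sample(miners, k):    # k distinct miners per shard
            registry.store(miner, (owner, j), shard)


registry = PlacementRegistry()
upload(owner=0, data=b"0123456789abcdef", shard_size=4, k=3,
       miners=["m1", "m2", "m3", "m4", "m5"], registry=registry,
       rng=random.Random(0))

for (miner, position), label in sorted(registry.labels.items()):
    print(miner, position, label)
```

A miner only ever sees an ordered list of opaque shards, while the mapping from (miner, position) to (i, j) stays with the contract.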

By slicing and replicating the data and storing them in a decentralized manner, we
can effectively prevent malicious miners from launching sybil attacks or other
attacks. We also impose specific requirements on how shards are sent in order to
avoid joint attacks by miners.


Definition 1 (Request of distributing shards): The distribution of the set
{D(i, j, k), ∀ i, j} to miners is subject to the following principles:


1. The number of miners M_a storing the data of D(i) cannot be less than half of J_i.

2. No miner M_a will receive two or more storage requests for a single copy of data
D(i, j).

3. No miner M_a will receive storage requests for both D(i, j) and D(i, j + 1).

4. Miners M_a and M_b will not receive storage requests for D(i, j) and D(i, j')
together.

5. No miner will store more than y copies of D(i).

6. Miners M_a and M_b will store no more than z identical shards in the shards pool.


This ensures that the data are stored in a sufficiently decentralized manner, with
enough miners jointly storing the data owner's data, so that a single point of
failure does not have a major impact on the overall storage. It also ensures that
the user's data cannot be stolen in their entirety, guaranteeing the security of the
data. Definition 1 also makes the data stored by any two nodes different, which
helps to avoid outsourcing attacks. We specifically analyze the effectiveness of our
algorithm in Sect. 4.
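
To make the principles of Definition 1 concrete, the following sketch checks a proposed placement for a single data owner. It covers principles 1, 2, 3, 5, and 6 under one possible reading: principle 5 is taken to bound the number of shards of D(i) held by one miner, and principle 4 is left out because its exact pairing condition is not spelled out here. The function name `check_placement` and the data layout are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def check_placement(placement: dict[str, list[int]], num_shards: int,
                    y: int, z: int) -> list[str]:
    """placement maps a miner id to the shard indices j it is asked to store.

    Returns a list of violated principles (empty if none were found).
    """
    problems = []
    holders = [m for m, js in placement.items() if js]
    if len(holders) < num_shards / 2:
        problems.append("principle 1: too few miners store D(i)")
    for miner, js in placement.items():
        counts = Counter(js)
        if any(c > 1 for c in counts.values()):
            problems.append(f"principle 2: {miner} holds duplicate copies")
        if any(j + 1 in counts for j in counts):
            problems.append(f"principle 3: {miner} holds consecutive shards")
        if len(js) > y:
            problems.append(f"principle 5: {miner} holds more than y shards")
    for a, b in combinations(placement, 2):
        if len(set(placement[a]) & set(placement[b])) > z:
            problems.append(f"principle 6: {a} and {b} share more than z shards")
    return problems


# Four shards, k = 2 copies each, spread over five miners.
good = {"m1": [0, 2], "m2": [1, 3], "m3": [0, 3], "m4": [1], "m5": [2]}
bad = {"m1": [0, 1, 1]}

print(check_placement(good, num_shards=4, y=2, z=1))
print(check_placement(bad, num_shards=4, y=2, z=1))
```

A distribution routine could call such a check before sending storage requests and redraw the miners whenever a principle is violated.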


3.3 Self-integrity Verification 


Now all data owners' data have been uploaded to the storage miners. We then need to
continuously interact with all miners to ensure that data liveness is guaranteed and
the data are being stored intact. This is the core of the work in this chapter. As
described in Sect. 2, the current auditing methods, whether fully decentralized or
not, are able to do the job but fall short in either the accuracy of audit results
or audit efficiency. This chapter proposes a solution that does not require
comparison against the data owner's data and achieves self-auditing through
self-comparison in the blockchain network, which substantially improves the long-term
stability of the system. At the same time, our proof is more efficient and can reach
a consensus on the integrity of all data in a short period of time once all nodes
have responded. We also allow data owners to initiate challenges and quickly check
the integrity of their own data through the hash algorithm.

The blockchain network audits whether the storage miners have correctly stored 
the corresponding data within a period T . To ensure timeliness, we use the last 16 bits 
of the hash value of the previous block as a random number ~. After getting -/, the 
miner has to upload the result of the hash operation of all his shards and ^ together 
with his signature M, sign within a specified time T. Now we obtain a new set: 
(hash (M40, N), M, sign) to express the result of the hash of all M,’s shards and 
its signature. It is important to note that the set is ordered, again according to the order 
in which M, stores the shards. The advantage of this design is that even when faced 
with a pile of results, the smart contract can determine the corresponding label (i, j) 
based on its position. We will use H (a, b) to denote the hash value corresponding 
to the bth shard of the miner a with J^, D(i, j). hash to denote the hash value of 
D(i, j) with “~ (Table2). 

After the storage miner M_a has uploaded his/her {hash(M_a(), 𝒩), M_a_sign},
the smart contract quickly determines whether the number of H(a, b) values is equal
to the number of shards already stored by M_a. If it does not match, the contract
invalidates the result and demands that M_a recalculate and upload a new result. If
it matches, the result is accepted and moves on to the hash pool. Next, the smart
contract counts the occurrences of each result in the hash pool. If there are
exactly k identical results, i.e., if there are k pairs (a, b) such that all the
values H(a, b) are equal, then it is decided that all the copies of the shard have
been stored correctly. This is the best result that can be achieved. All storage
miners need to store their data correctly for their own benefit. If all miners store
correctly, all the nodes can quickly and accurately obtain the result that the data
are stored intact. We now determine whether the data are stored correctly based on
the number of occurrences of each hash value.
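
The two checks performed by the smart contract can be sketched in a few lines of Python. Signature verification is omitted, SHA-256 with byte concatenation is an assumed hash construction, and `accept_submission` and `count_pool` are illustrative names rather than part of any Filecoin interface.

```python
import hashlib
from collections import Counter


def digest(shard: bytes, nonce: bytes) -> str:
    return hashlib.sha256(shard + nonce).hexdigest()


def accept_submission(stored_count: int, hashes: list[str]) -> bool:
    """First check: exactly one hash per shard the miner is known to store."""
    return len(hashes) == stored_count


def count_pool(submissions: dict[str, list[str]]) -> Counter:
    """Second check: count how often each hash value occurs in the pool."""
    return Counter(h for hashes in submissions.values() for h in hashes)


nonce = b"\x5a\xc3"
k = 3
shard_a, shard_b = b"shard A", b"shard B"

submissions = {
    "m1": [digest(shard_a, nonce), digest(shard_b, nonce)],
    "m2": [digest(shard_a, nonce)],
    "m3": [digest(shard_a, nonce), digest(shard_b, nonce)],
    "m4": [digest(b"corrupted", nonce)],      # m4 was supposed to hold shard B
}
stored_counts = {"m1": 2, "m2": 1, "m3": 2, "m4": 1}

accepted = {m: h for m, h in submissions.items()
            if accept_submission(stored_counts[m], h)}
pool = count_pool(accepted)
intact = {h for h, c in pool.items() if c == k}

print(len(intact), "hash value(s) occur exactly k times")
```

Here only the hash of shard A occurs k times, so shard B would be passed on to the handling for results that fall short of k.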


Definition 2 (strong integrity): The number of occurrences of D(i, j)_hash
is exactly equal to k.


Definition 3 (weak integrity): The number of occurrences of D(i, j)_hash is
greater than or equal to 2 and less than k.


Table 2 Notations for operations/implications

Symbol          Notations for operations/implications
U               Data owners
M               Storage miners
A               Audit miners
U_i             The ith data owner
D(i)            Data owner i's data
D(i, j)         The jth shard of U_i's data
k               The number of copies
J_i             The number of U_i's shards
{D(i, j, k)}    All of U_i's shards
M_a             The ath storage miner
(i, j)          The label of D(i, j)
M_a(6)          The sixth shard stored by M_a
M_a()           The set of M_a's stored shards
T               Cycle time for storage miners' uploads
N               The random number set by the user
𝒩               The random number from the previous block
M_a_sign        M_a's digital signature
H(a, b)         M_a(b)'s hash value with 𝒩
D(i, j)_hash    D(i, j)'s hash value with 𝒩


If all shards achieve strong integrity, we can assume that the storage network has 
stored all data correctly and that all nodes would agree on this. If all shards achieve 
weak integrity, we can assume that all data are stored securely on the storage network. 
Weak integrity is a lower requirement for data availability in storage networks. During 
auditing, it is more of a constraint on the miners, so strong integrity is what is required 
by distributed storage networks. 
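
Definitions 2 and 3 amount to a small case distinction on the number of occurrences of a hash value. In the sketch below, the labels "excess" and "insufficient" for the remaining cases are names chosen here for illustration; the text defines only strong and weak integrity.

```python
def classify(occurrences: int, k: int) -> str:
    """Classify a shard by how often its hash value occurs in the hash pool."""
    if occurrences == k:
        return "strong"          # Definition 2
    if 2 <= occurrences < k:
        return "weak"            # Definition 3
    if occurrences > k:
        return "excess"          # more results than copies were distributed
    return "insufficient"        # fewer than two matching results


for count in (3, 2, 4, 1):
    print(count, classify(count, k=3))
```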

We will now discuss what to do if strong integrity is not achieved. If the number of
occurrences of a hash value is greater than k, the likely scenario is that miners
are jointly misbehaving and copying the same result for output. This is because a
storage miner that receives the shard corresponding to that result receives no other
copy of it, so extra occurrences can arise only if the miner has obtained other
miners' shards. In this case, the k + o results are each assigned a label (i, j)
based on their position, and the audit miners A compare the labels to find the miner
with the incorrect result. The first step is to find the set {(i', j')} of labels
corresponding to the wrong hash value, and then to check whether the number of
occurrences of the hash value for each of them is k. If it is k, the shard has been
completely stored in the storage network. Otherwise, this number can only be less
than k; A then needs to find which miners did not upload the right value and ask them
to upload it in time. If they upload the wrong results, A asks them to re-store
the shard correctly to solve the problem.
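
One way to picture the label comparison for a hash value that occurs more than k times is sketched below. It assumes that the label shared by most of the occurrences is the genuine one, which is an assumption of this sketch and not a rule stated in the text; `find_copied_results` and the example data are likewise hypothetical.

```python
from collections import Counter


def find_copied_results(occurrences: list[tuple[str, int]],
                        labels: dict[tuple[str, int], tuple[int, int]]
                        ) -> list[tuple[str, int]]:
    """occurrences: (miner, position) pairs that all reported the same hash.

    labels: the contract's mapping from (miner, position) to the label (i, j).
    Returns the occurrences whose label differs from the majority label.
    """
    label_counts = Counter(labels[o] for o in occurrences)
    genuine_label, _ = label_counts.most_common(1)[0]
    return [o for o in occurrences if labels[o] != genuine_label]


# k = 3 copies of D(0, 1) exist, but the same hash value was reported 4 times.
labels = {("m1", 1): (0, 1), ("m2", 3): (0, 1),
          ("m3", 2): (0, 1), ("m4", 2): (0, 2)}
same_hash = [("m1", 1), ("m2", 3), ("m3", 2), ("m4", 2)]

suspects = find_copied_results(same_hash, labels)
print(suspects)   # the result reported at a position that belongs to D(0, 2)
```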

In fact, it is more often the case that the number is less than k. For hash values
that occur fewer than k times, we need the corresponding miners to upload
proofs of the correct storage of the corresponding D(i, j). The following cases may
occur:


1. The miner correctly stores D(i, j) and uploads the correct hash result.
2. The miner correctly stores D(i, j) but uploads the wrong hash result.
3. The miner incorrectly stores D(i, j) but uploads the correct result.
4. The miner incorrectly stores D(i, j) and uploads the wrong result.


Audit miners A need to copy D(i, j) immediately to ensure that it cannot
be lost. After that, A handles errant storage miners as above. Such handling
effectively avoids errors caused by miners being offline. We also judge storage
miners who make frequent errors to be malicious miners. If, for the same D(i, j), all
the results of the hash operation are different, or it is not possible to tell which
result is correct, then A can ask all miners storing D(i, j) to recalculate it with
the random number N and compare it with the result calculated by U_i. A then promptly
copies the data of the miners that output the correct result and asks U_i to add
another random number N to the calculation and keep the result for future use
(Fig. 3).
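
The fallback with the owner's random number N can be sketched as follows. The reference value is the one U_i computed before uploading; SHA-256 with concatenation and the function name `miners_with_correct_copy` are assumptions, and in a real deployment each miner would compute its own result instead of handing over its copy.

```python
import hashlib


def digest(shard: bytes, nonce: bytes) -> str:
    return hashlib.sha256(shard + nonce).hexdigest()


def miners_with_correct_copy(reference: str, copies: dict[str, bytes],
                             secret: bytes) -> list[str]:
    """Miners whose recomputed result equals the owner's stored reference."""
    return sorted(m for m, shard in copies.items()
                  if digest(shard, secret) == reference)


secret_n = b"random number N known only to U_i"
shard = b"encrypted shard D(i, j)"
reference = digest(shard, secret_n)        # kept by U_i from upload time

copies = {"m1": shard, "m2": b"bit-rotted shard", "m3": shard}
good = miners_with_correct_copy(reference, copies, secret_n)
print(good)   # the audit miner copies the shard from one of these miners
```

Once N has been revealed for such a check it is no longer secret, which is why the owner is asked to choose a fresh N afterwards.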

The above is the process by which a blockchain storage network audits itself.
This process allows consensus to be reached quickly when all the data are stored
correctly, and allows malicious nodes to be found when consensus is not reached.


3.4 Data Owner's Integrity Verification 


After the data owner U_i gets 𝒩, he/she can also obtain a set of hash values
H(i, j) by performing the hash operation on his/her own data shards D(i, j). The
smart contract looks for k occurrences of each of these values in the hash pool to
determine whether his/her data have been stored completely. If every result occurs
exactly k times, then it is almost certain that U_i's data have been stored
correctly. If not, the problematic storage miner can be found quickly, and the audit
miner copies the data into his/her own storage in time. Such an audit approach
remedies the shortcomings of self-integrity verification and increases the accuracy
of data integrity verification.
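
From the data owner's side, the check reduces to looking up his/her own hash values in the hash pool. The sketch below returns the shards whose hash value does not occur exactly k times; `owner_audit` is an illustrative name, and SHA-256 with concatenation is again an assumed hash construction.

```python
import hashlib
from collections import Counter


def digest(shard: bytes, nonce: bytes) -> str:
    return hashlib.sha256(shard + nonce).hexdigest()


def owner_audit(own_shards: list[bytes], nonce: bytes,
                hash_pool: list[str], k: int) -> dict[int, int]:
    """Map shard index j to its occurrence count, for counts other than k."""
    counts = Counter(hash_pool)
    report = {}
    for j, shard in enumerate(own_shards):
        seen = counts[digest(shard, nonce)]
        if seen != k:
            report[j] = seen
    return report


nonce = b"\x12\x34"
own_shards = [b"shard 0", b"shard 1"]
hash_pool = [digest(b"shard 0", nonce)] * 2 + [digest(b"shard 1", nonce)]

print(owner_audit(own_shards, nonce, hash_pool, k=2))   # shard 1 seen once
```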


3.5 The Game of Miners Versus Storage Networks 


Storage miners can only earn if they store the user's data correctly and upload
H(a, b) correctly. If a miner wants to earn without storing correctly, he/she needs
to collude with other miners. The miner does not know the label (i, j) of the data
he/she is storing, so he/she needs to send a request to all miners in the network,
and other miners can be rewarded for reporting such malicious nodes. A coalition of
storage miners does not earn a reward; only the individual fulfillment of the
storage function lets the storage network maximize its benefits. Audit miners are
given the appropriate audit access only if there is a problem with a storage miner.
Audit miners can earn more rewards only by continuously completing audit tasks and
tasks delegated by data owners. These rules ensure that all miners are driven by
profit toward stability.


Fig. 3 Audit algorithm

Thus our system satisfies the incentive-compatibility property as well as data
integrity, recoverability, public verifiability, and auditability. The satisfaction
of these five properties is obvious. These are the same properties that Filecoin
satisfies. We can say that our audit proof is reasonable.


4 Fault-Tolerance Verification


We now describe three attacks that are common in distributed storage networks. 


Sybil Attack (Douceur, 2002): A sybil attack is a type of attack in peer-to-peer
networks in which a node operates multiple identities actively at the same time and
undermines the authority/power in reputation systems. In a distributed storage
network, a malicious miner can create multiple sybil identities pretending to store
many copies in order to be rewarded, while only one copy is actually stored locally.

In our proof, a miner cannot claim to have stored multiple copies, as the number
of copies per shard is limited to k. Meanwhile, there is little additional gain for
a malicious miner in pretending to store multiple copies by creating multiple
identities, since each miner stores different content and any two storage miners
share fewer than z identical shards. We control the revenue in such a way that
storage miners will not receive enough benefit from creating a sybil identity,
making them less likely to take the risk. Subsequently, we can limit such a
situation even further by monitoring IP addresses, generating M_a() proofs, etc.
Such a design makes sybil attacks much less profitable.

Outsourcing Attack: By relying on fast access to data from other storage providers,
malicious miners promise to store more data than they can actually store.

If a malicious miner wants to launch an outsourcing attack, the miner cannot
know the shard labels and can only determine whether there is an overlapping shard
by sharing H(a, b) sets with other miners; if there is an overlapping shard, the
hash result can be quickly retrieved later in the audit. But the benefit to the
provider is weak, and the inclusion of a reporting mechanism keeps miners from
going to extremes for such a weak benefit. We can therefore conclude that the
benefits of a small number of miners cooperating are much smaller than the risks
associated with incomplete storage.

Generation Attack: Malicious miners use a small program to claim more storage than
they actually have, in order to gain a greater advantage in the mining competition.

With slicing and cryptography, miners cannot effectively generate the data with a
small program. The proof results must be computed by a hash function, and a small
change in the input leads to a huge difference in the result. There are strict
penalties for generation attacks in Filecoin, so this attack can be substantially
avoided from an incentive point of view.


5 Concluding Remarks


In this chapter, we review current research on auditing and point out its
imperfections. We also analyze the audit requirements for Filecoin and redesign an
audit algorithm for it. The algorithm determines whether the data have been stored
intact in the storage network by having storage miners upload hash results and
comparing the results in the hash pool. The audit miner is set to a
semi-participating state and joins, with the corresponding access, only if a problem
arises. Such an auditing algorithm is relatively accurate and secure for
decentralized storage networks. In addition, our analysis indicates that the
algorithm is highly fault-tolerant.

Our algorithm is not yet well designed in terms of incentives, and it remains to be
shown that the algorithm can be put into widespread use. Incentives are a key part
of getting the algorithm adopted, and the game between miners and the storage
network must be designed so that both sides can reach the optimal solution for their
interests. The regulation of the data also needs to be considered in the next
phase. Our algorithm needs to be made more complete in the future.


Abstract Traditional iterative learning control (ILC) algorithms usually assume
that full system information and operation data can be utilized. However, due to the
uncertainty and complexity of actual systems, it is difficult to access full system
information and operation data accurately and completely. In this chapter, a novel
ILC scheme based on stochastic variance reduced gradient (SVRG) is proposed. This
scheme is not only suitable for resolving the incomplete information problem but
also converges efficiently under both strongly convex and non-strongly convex
control objectives. To demonstrate the advantages, this chapter studies two
scenarios, i.e., random error data dropout and the model-free data-driven approach,
and proposes an SVRG-based ILC algorithm for each. It is theoretically demonstrated
and experimentally verified that the proposed SVRG-based ILC scheme converges faster
than both the full gradient and stochastic gradient methods in the two scenarios.


1 Introduction


1.1 Background 


Iterative learning control (ILC) is a control method applicable to systems that
perform repeated operations. The basic idea is to use the input and error signals
from the previous iteration to improve the input of the next iteration. Arimoto et
al. first proposed iterative learning control for robotic arms in 1984 and clarified
its basic idea (Arimoto et al., 1984). Subsequently, numerous papers on ILC have
been published. It has gradually become one of the important branches in the field
of control and is widely used in robotics, industrial production, hard disk
manufacturing, and other controlled systems with repeated operation.

To achieve excellent control performance, most ILC algorithms assume that full
operational data and system information can be obtained and utilized. However, in
real systems, data delays and dropouts often occur due to various uncertainties.
Moreover, when the system structure is complex or unstable, it is difficult to
obtain the system information accurately. It is therefore of great theoretical and
practical significance to design high-performance ILC algorithms that solve the
incomplete information problems.

Information incompleteness can be classified into two categories, objective and 
subjective incompleteness. Information incompleteness caused by objective factors 
is often related to the uncertainty of the system itself. For example, during the trans- 
mission of the signal, the instability of the channel can cause data packet loss. Three 
main random packet dropout models have been developed for this problem: the ran- 
dom sequence model, the Bernoulli distribution model, and the Markov chain model. 
Shen (2018) designed an iterative learning control algorithm based on the stochas- 
tic approximation algorithm corresponding to the three models and proved that the 
algorithms satisfy mean-square convergence and probabilistic strong convergence. 
Information incompleteness due to subjective factors usually arises when the
system information is deliberately assumed to be unknown, thus avoiding the
complexity of system modeling and system instability. For example, Oomen et al.
(2014) designed a model-free data-driven iterative learning control algorithm for
H∞-norm estimation of multi-input multi-output (MIMO) systems, which obtains the
full gradient by conducting n_o × n_i experiments on n_o × n_i-dimensional MIMO
systems. However, this algorithm is difficult to apply to large MIMO systems due to
the excessive number of experiments. Subsequently, Aarnoudse and Oomen (2020)
designed an iterative learning control algorithm based on the stochastic
approximation method by constructing a random matrix to estimate the gradient, which
effectively reduces the number of experiments.

It is important to note that the effect of information incompleteness on ILC
tracking performance is essentially a matter of the robustness of ILC. However, this
robustness differs for objective and subjective information incompleteness problems.
The former is usually model-based and emphasizes modeling to analyze the causes of
information deficiency, while the latter is data-based and is generally not
concerned with the causes of information deficiency but with the inherent
limitations that information deficiency places on control performance. Model-based
and data-based control methods are not opposed; to achieve the best control effect,
the two can also be used in combination. Existing studies on ILC for the information
incompleteness problems are usually based on the stochastic approximation method or
other gradient methods. In this chapter, we use a stochastic variance reduced
gradient (SVRG) method to give a general framework for solving the system
information incompleteness problems.


1.2 Design and Analysis of SVRG-Based ILC 


Modeling the control objective as an optimization function for a deterministic
discrete-time linear system, Owens et al. (2009) proposed a gradient-type ILC
algorithm based on optimization ideas and analyzed the stability, monotonicity,
and robustness of the algorithm. For noisy discrete-time linear systems, Yang and
Ruan (2017) proposed an enhanced gradient-based ILC algorithm that converges
effectively in the presence of perturbations in the system. However, these
gradient-based ILC algorithms require full error and system information in each
iteration, and when this information is not fully available, the traditional
gradient-based ILC algorithms are no longer applicable.

Notice that in Machine Learning, the Stochastic Gradient Descent (SGD) method
replaces the full gradient by a randomly selected partial gradient in each step.
In the corresponding control problems, a partial gradient can still be obtained
when there is insufficient information about the error or the system. This
correspondence inspires us to look for suitable stochastic gradient methods to
solve problems with insufficient error or system information.

In order to improve the convergence speed and to handle non-smooth and
non-strongly convex objective functions, recent research in Machine Learning has
produced a large number of improved versions of the stochastic gradient descent
algorithm, including the momentum method, the variance reduction method, and
incremental aggregated gradients. Allen-Zhu (2018) divided these algorithms into
three generations according to their complexity under strongly convex conditions.
The first generation is the momentum-based gradient algorithm, the second
generation includes the variance reduction-based gradient algorithm and the
proximal stochastic variance reduction gradient algorithm, and the third
generation includes the Katyusha algorithm and incremental aggregated gradient
algorithms. In most cases, the complexity of the algorithms decreases from one
generation to the next. For specific control problems, the first-generation
algorithms are slow to converge and often fail to meet practical needs, while the
third-generation algorithms require accurate system modeling to achieve faster
convergence and are difficult to apply to data-driven ILC. Therefore, the research
in this chapter is mainly based on a second-generation algorithm, the Stochastic
Variance Reduction Gradient (SVRG) algorithm.


1.3 Main Work and Organization


The purpose of this chapter is to construct SVRG-based ILC and use this framework 
to solve specific information incompleteness problems. As representatives of objec- 
tive and subjective information incompleteness, two scenarios, error data random 
dropouts and model-free ILC, are selected in this chapter to give the corresponding 
SVRG-based ILC algorithms, respectively. The contribution is threefold. 


1. Propose an SVRG-based ILC framework for single-input single-output (SISO)
systems. The algorithm is shown to converge linearly under smooth and strongly
convex conditions.

2. Apply the SVRG-based ILC framework to random dropouts of error data and
give the convergence proof of the algorithm.

3. Extend the SVRG-based ILC framework to multi-input multi-output (MIMO)
systems in the model-free data-driven scenario and prove the convergence of the
algorithm under smooth and non-strongly convex conditions.


Section 2 serves as the basis of the chapter, giving the SVRG-based ILC framework
for SISO systems. Section 3 applies the framework to the problem of random
dropouts of error data. Section 4 extends the framework to MIMO systems in the
model-free scenario. Since Sect. 2 only gives the algorithm framework and does not
cover a specific scenario, it contains no numerical simulations and has only three
parts: system description, algorithm design, and convergence analysis. Sections 3
and 4 each include four parts: system description, algorithm design, convergence
analysis, and numerical simulation.


2 SVRG-Based ILC Framework 


As the basis of the following sections, this section uses SISO systems to give the 
basic framework of SVRG-based ILC algorithm. This section includes three parts: 
system description, algorithm design, and convergence analysis. 


2.1 System Description 


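Consider the following single-input single-output (SISO) discrete-time linear system

$$
\begin{aligned}
x_k(t+1) &= A x_k(t) + B u_k(t), \\
y_k(t) &= C x_k(t),
\end{aligned} \tag{1}
$$

where $t = 0, 1, \ldots, N-1$ is the time index, and $k = 1, 2, 3, \ldots$ denotes
the iteration index. $x_k(t) \in \mathbb{R}^n$, $u_k(t) \in \mathbb{R}$, and
$y_k(t) \in \mathbb{R}$ represent the system state, input, and output,
respectively. $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n}$, and
$C \in \mathbb{R}^{1 \times n}$ are the system matrices. The initial condition is
the same for each iteration, i.e., $x_k(0) = x_0$, $\forall k \in \mathbb{N}^+$.
Taking $t = 0, 1, \ldots, N-1$ in (1) yields

$$
\begin{aligned}
y_k(1) &= CA x_k(0) + CB u_k(0) = CB u_k(0) + CA x_0, \\
y_k(2) &= CA x_k(1) + CB u_k(1) = CAB u_k(0) + CB u_k(1) + CA^2 x_0, \\
&\;\;\vdots \\
y_k(N) &= CA x_k(N-1) + CB u_k(N-1) \\
&= CA^{N-1}B u_k(0) + CA^{N-2}B u_k(1) + \cdots + CAB u_k(N-2) + CB u_k(N-1) + CA^{N} x_0.
\end{aligned}
$$

Combining the above equations, system (1) can be rewritten in the following
equivalent form

$$
y_k = H u_k + K x_0, \tag{2}
$$

where $u_k = [u_k(0), u_k(1), \ldots, u_k(N-1)]^T \in \mathbb{R}^N$,
$y_k = [y_k(1), y_k(2), \ldots, y_k(N)]^T \in \mathbb{R}^N$,

$$
H = \begin{bmatrix}
h_{11} & 0 & \cdots & 0 \\
h_{21} & h_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
h_{N1} & h_{N2} & \cdots & h_{NN}
\end{bmatrix}
= \begin{bmatrix}
CB & 0 & \cdots & 0 \\
CAB & CB & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
CA^{N-1}B & CA^{N-2}B & \cdots & CB
\end{bmatrix},
\qquad
K = \begin{bmatrix} CA \\ CA^2 \\ \vdots \\ CA^N \end{bmatrix}.
$$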
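
The lifted form (2) can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical two-state system that is not taken from this chapter; the helper names `lifted_matrices` and `simulate` are introduced here only for illustration. It builds $H$ and $K$ from $(A, B, C)$ and compares $H u_k + K x_0$ with a direct simulation of (1).

```python
import numpy as np

def lifted_matrices(A, B, C, N):
    """Return H (N x N) and K (N x n) such that y = H u + K x0, as in (2)."""
    n = A.shape[0]
    H = np.zeros((N, N))
    K = np.zeros((N, n))
    A_pow = np.eye(n)                      # holds A^i at the start of iteration i
    markov = []                            # Markov parameters C A^i B
    for i in range(N):
        markov.append((C @ A_pow @ B).item())
        A_pow = A_pow @ A                  # now A^(i+1)
        K[i] = (C @ A_pow).ravel()         # row i of K is C A^(i+1)
    for i in range(N):
        for j in range(i + 1):
            H[i, j] = markov[i - j]        # lower-triangular Toeplitz structure
    return H, K

def simulate(A, B, C, u, x0):
    """Run system (1) for one trial and return [y(1), ..., y(N)]."""
    x = x0.copy()
    y = []
    for u_t in u:
        x = A @ x + B.ravel() * u_t
        y.append((C @ x).item())
    return np.array(y)

# Hypothetical example system (an assumption for illustration only).
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 1.0]])
N = 8
rng = np.random.default_rng(0)
u = rng.standard_normal(N)
x0 = rng.standard_normal(2)

H, K = lifted_matrices(A, B, C, N)
y_lifted = H @ u + K @ x0
y_sim = simulate(A, B, C, u, x0)
print(np.allclose(y_lifted, y_sim))
```

The diagonal of `H` equals $CB$, so Assumption 1 below amounts to requiring that `H` is invertible.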


For further analysis, the following assumptions are required.

Assumption 1 The input/output coupling matrix $CB \neq 0$.


Assumption 2 For the desired trajectory $y_d(t)$, there exist a unique desired
input $u_d(t)$ and initial state $x_d(0)$ such that

$$
\begin{aligned}
x_d(t+1) &= A x_d(t) + B u_d(t), \\
y_d(t) &= C x_d(t).
\end{aligned} \tag{3}
$$

Written in the form of (2), this gives

$$
y_d = H u_d + K x_d(0). \tag{4}
$$


Remark 1 Assumptions 1 and 2 describe the realizability of the desired trajectory
$y_d$ for the system. To be specific, Assumption 1 means that the relative degree
of the system is 1. Assumption 2 describes the existence of an input signal $u_d$
that can precisely track $y_d$. If the system does not satisfy Assumption 2, the
system output can only be made as close as possible to the desired trajectory $y_d$.


Fig. 1 Block diagram of ILC


Assumption 3 The initial states of (1) and (3) are identical, i.e.,
$x_k(0) = x_d(0) = x_0$, $\forall k$. Assume that $x_0 = 0$.


Remark 2 Assumption 3 is based on the requirement for system repeatability in
ILC. In order to simplify the algorithm, without loss of generality, take
$x_0 = 0$. It is easy to verify that the results of this chapter are also valid
when $x_0 \neq 0$.


In this chapter, the above three assumptions will be followed, but in fact, the 
SVRG-based ILC can also be established when Assumptions 1 and 2 are appro- 
priately relaxed. Section 4 will give specific explanations on how to relax these 
assumptions. 

Figure 1 illustrates the basic framework of ILC. The plant takes the input $u_k$
and generates the output $y_k$; the error $e_k = y_d - y_k$ between the desired
trajectory $y_d$ and the output is transmitted to the controller. The controller
uses the error $e_k$ and the input $u_k$ to calculate the input signal $u_{k+1}$
for the next batch and transmits it to the plant. Our goal is to find a sequence
of inputs $\{u_k\}$ such that

$$
\lim_{k \to \infty} \|e_k\| = \lim_{k \to \infty} \|y_d - y_k\| = 0, \tag{5}
$$

where $\|\cdot\|$ is the vector 2-norm and its induced matrix norm; henceforth
$\|\cdot\|$ refers to this norm if not otherwise specified.
By Assumptions 2 and 3, (5) is equivalent to the following optimization problem
for the function $F$:

$$
F(u_k) \triangleq \frac{1}{2N}\|e_k\|^2 = \frac{1}{2N}\|y_d - H u_k\|^2,
\qquad \lim_{k \to \infty} F(u_k) = 0. \tag{6}
$$

2.2 Algorithm Design 


The traditional gradient-based ILC updating law (Gu et al., 2019) is

$$
u_{k+1} = u_k - \eta_k \nabla_k, \tag{7}
$$

where $\eta_k$ denotes the step length, and $\nabla_k$ is the gradient of the
objective function. From (6), we have

$$
\nabla_k = \nabla F(u_k) = -\frac{1}{N} H^T e_k. \tag{8}
$$

In (8), calculating the full gradient requires all the information of the
transposed matrix $H^T$ and the error $e_k$. To give the gradient under partial
information, consider decomposing the error
$e_k = [e_k(1), e_k(2), \ldots, e_k(N)]^T$ according to the time index. Let

$$
f_i(u_k) = \frac{1}{2}\|e_k(i)\|^2 = \frac{1}{2}\left(y_d(i) - h_i^T u_k\right)^2,
$$

where $h_i = [h_{i1}, \ldots, h_{ii}, 0, \ldots, 0]^T$ denotes the $i$-th row of
the matrix $H$, written as a column vector. Then, equation (6) can be rewritten as

$$
F(u_k) = \frac{1}{N}\sum_{i=1}^{N} f_i(u_k). \tag{9}
$$

Taking the gradient of both sides, we have

$$
\nabla F(u_k) = \frac{1}{N}\sum_{i=1}^{N} \nabla f_i(u_k)
= -\frac{1}{N}\sum_{i=1}^{N} h_i e_k(i). \tag{10}
$$

Define the random gradient $\tilde{\nabla}_k$ as a discrete random variable that
takes values uniformly over $\{\nabla f_i(u_k)\}_{i=1}^{N}$, satisfying
$P\left(\tilde{\nabla}_k = \nabla f_i(u_k)\right) = \frac{1}{N}$. Therefore
$\mathbb{E}[\tilde{\nabla}_k] = \nabla_k$, i.e., $\tilde{\nabla}_k$ is an unbiased
estimate of $\nabla_k$.
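
The unbiasedness claim can be checked directly: averaging the $N$ partial gradients $\nabla f_i(u_k) = -h_i e_k(i)$ must reproduce the full gradient (8). The sketch below assumes NumPy and a hypothetical lower-triangular Toeplitz $H$ (Markov parameters $0.5^{i-j}$, chosen only for illustration).

```python
import numpy as np

N = 6
idx = np.arange(N)
# Hypothetical lifted matrix: H[i, j] = 0.5**(i - j) for i >= j, zero above the diagonal.
H = np.tril(0.5 ** np.subtract.outer(idx, idx))

rng = np.random.default_rng(0)
y_d = rng.standard_normal(N)
u = rng.standard_normal(N)

e = y_d - H @ u                               # error vector e_k
full_grad = -(H.T @ e) / N                    # full gradient, eq. (8)

# Row i holds the partial gradient of f_i: it uses only row i of H and the scalar e[i].
partial_grads = np.array([-H[i] * e[i] for i in range(N)])

# Expectation of the uniform random gradient = average of the partial gradients.
mean_grad = partial_grads.mean(axis=0)
print(np.allclose(mean_grad, full_grad))
```

Each row of `partial_grads` needs one row of `H` and one entry of the error, which is the reduction in required information that the decomposition is meant to provide.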

Note that by decomposing (6) into (9), calculating the specific value
$\nabla f_i(u_k)$ of the random vector $\tilde{\nabla}_k$ requires only one row of
the system matrix $H$ and one entry of the error $e_k$. Therefore the
decomposition can effectively reduce the information required for each iteration.
This technique of gradient decomposition is the basis for solving ILC problems
with information incompleteness using the SVRG method in this chapter. In Sects. 3
and 4, two specific decomposition methods are presented for incompleteness of
error and system information, respectively.

Consider the stochastic gradient descent (SGD) method used in Machine Learning.
Replacing the full gradient $\nabla_k$ with the stochastic gradient
$\tilde{\nabla}_k$ in the ILC updating law (7), we obtain the SGD-based ILC
algorithm. However, the convergence rate of the SGD algorithm is $O(1/\sqrt{k})$
even under the strongly convex condition (Allen-Zhu, 2018), which cannot meet
practical requirements. This is because, although the stochastic gradient
$\tilde{\nabla}_k$ is an unbiased estimate of the full gradient $\nabla_k$, the
variance accumulates as the iterations proceed. To reduce the variance, Johnson
and Zhang (2013) proposed a general stochastic variance reduced gradient (SVRG)
descent method. By recording a "snapshot" $\tilde{u}^s$ every few updates to
construct a converging upper bound on the variance of the gradient estimate, the
SVRG method achieves a rate of convergence of $O(\alpha^s)$ under the strongly
convex condition and $O(1/k)$ under the non-strongly convex condition. Based on
this method, the input updated with the "snapshot" $\tilde{u}^s$ is denoted as
$u_{s,k}$, and the SVRG-based ILC updating law is

$$
u_{s,k+1} = u_{s,k} - \eta\left(\nabla f_i(u_{s,k}) - \nabla f_i(\tilde{u}^s) + \nabla F(\tilde{u}^s)\right). \tag{11}
$$

For system (2), the SVRG-based ILC algorithm with updating law (11) is shown 
in Algorithm 1. 


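Algorithm 1 SVRG-based ILC framework for SISO systems
Input: $\eta$, $u_{0,0}$;
$m \leftarrow 2N$; $\tilde{u}^0 \leftarrow u_{0,0}$;
for $s \leftarrow 0$ to $S - 1$ do
    $u_{s,0} \leftarrow \tilde{u}^s$, $\mu_s \leftarrow \nabla F(\tilde{u}^s)$;
    for $k \leftarrow 0$ to $m - 1$ do
        $w_k \leftarrow \nabla f_i(u_{s,k}) - \nabla f_i(\tilde{u}^s) + \mu_s$, where $i$ is drawn uniformly at random from $\{1, 2, \ldots, N\}$;
        $u_{s,k+1} \leftarrow u_{s,k} - \eta w_k$;
    end for
    Option I: $\tilde{u}^{s+1} \leftarrow \frac{1}{m}\sum_{k=0}^{m-1} u_{s,k}$;
    Option II: $\tilde{u}^{s+1} \leftarrow u_{s,m}$;
end for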


Algorithm 1 has two loops. The outer loop updates the "snapshot" $\tilde{u}^s$
once every $m$ iterations of the inner loop. The iteration length $m$ is taken as
an integer multiple of $N$, which is empirically set to $2N$. Lines 9 and 10 of
Algorithm 1 show two ways of updating the "snapshot", Option I and Option II.
Option I takes the average of the inputs $u_{s,0}, \ldots, u_{s,m-1}$ as the
"snapshot", without using $u_{s,m}$, so the inner loop actually requires only
$m - 1$ iterations. Option II instead runs the $m$-th iteration and uses
$u_{s,m}$ as the "snapshot". The two "snapshot" updating methods do not change
the convergence of Algorithm 1 (Bottou et al., 2018). Due to the limitation of
space, we only prove the convergence of Option I under strongly convex conditions
in this section and of Option II under non-strongly convex conditions in Sect. 4.
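
A minimal Python sketch of Algorithm 1 is given below, assuming NumPy, zero initial input, and a hypothetical first-order plant whose lifted matrix has entries $0.5^{i-j}$. The function name `svrg_ilc`, the step size $\eta = 0.1/L$ with $L = \max_i \|h_i\|^2$, and the test trajectory are illustrative assumptions, not the settings used in this chapter.

```python
import numpy as np

def svrg_ilc(H, y_d, eta, S, m=None, option="I", seed=0):
    """Sketch of Algorithm 1. Returns the final snapshot and ||e|| per outer loop."""
    rng = np.random.default_rng(seed)
    N = H.shape[0]
    m = 2 * N if m is None else m             # empirical inner-loop length
    u_snap = np.zeros(N)                      # snapshot, initialised with u_{0,0} = 0
    errors = [np.linalg.norm(y_d - H @ u_snap)]
    for s in range(S):
        e_snap = y_d - H @ u_snap             # full error at the snapshot
        mu = -(H.T @ e_snap) / N              # full gradient at the snapshot, eq. (8)
        u = u_snap.copy()
        history = []
        for k in range(m):
            history.append(u.copy())          # stores u_{s,0}, ..., u_{s,m-1}
            i = rng.integers(N)               # uniformly random time index
            h_i = H[i]
            grad_u = -h_i * (y_d[i] - h_i @ u)    # gradient of f_i at u_{s,k}
            grad_snap = -h_i * e_snap[i]          # gradient of f_i at the snapshot
            w = grad_u - grad_snap + mu           # variance-reduced direction
            u = u - eta * w                       # updating law (11)
        u_snap = np.mean(history, axis=0) if option == "I" else u
        errors.append(np.linalg.norm(y_d - H @ u_snap))
    return u_snap, errors

# Hypothetical plant and desired trajectory (assumptions for illustration).
N = 20
idx = np.arange(N)
H = np.tril(0.5 ** np.subtract.outer(idx, idx))
y_d = np.sin(2 * np.pi * np.arange(1, N + 1) / N)
L = np.max(np.sum(H**2, axis=1))              # max_i ||h_i||^2
u_final, errors = svrg_ilc(H, y_d, eta=0.1 / L, S=50)
print(errors[0], errors[-1])
```

Only the snapshot step uses the full error vector; each inner iteration touches a single row of `H` and a single error entry, mirroring the information pattern described above.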


2.3 Convergence Analysis 


This subsection is divided into two parts, first giving the convex optimization knowl- 
edge required for the proof of this chapter and then analyzing the convergence of the 
system (2) when the "snapshot" update method of Algorithm 1 is set for Option I. 


2.3.1 Preliminaries of Convex Optimization 


The basics of convex optimization required for this chapter are given below 
(Lyubashevsky, 2005). 
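Definition 1 (Smoothness) Suppose $S$ is a nonempty convex subset of
$\mathbb{R}^d$, $f : S \to \mathbb{R}$, $f \in C^1$. If $\exists L > 0$, s.t.
$\forall x, y \in S$,

$$
\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|,
$$

then we say that $f$ is $L$-smooth or $\nabla f(x)$ is $L$-Lipschitz continuous on
$S$, where $L$ is the Lipschitz constant.


Definition 2 (Strong convexity) Suppose $S$ is a nonempty convex subset of
$\mathbb{R}^d$, $f : S \to \mathbb{R}$, $f \in C^1$. If $\exists \sigma > 0$, s.t.
$\forall x, y \in S$,

$$
f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\sigma}{2}\|x - y\|^2,
$$

then we say that $f$ is $\sigma$-strongly convex on $S$. When $\sigma = 0$,
$f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle$, and $f$ is convex.


Definition 3 (Condition number) If $f$ is $L$-smooth and $\sigma$-strongly convex,
$\kappa = L/\sigma$ is the condition number of $f$.


Theorem 1 For a convex function $f$, the following are equivalent:
a. $\nabla f(x)$ is $L$-Lipschitz continuous,
b. $f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2}\|y - x\|^2$,
c. $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{1}{2L}\|\nabla f(y) - \nabla f(x)\|^2$,
d. $\frac{1}{L}\|\nabla f(x) - \nabla f(y)\|^2 \le \langle \nabla f(x) - \nabla f(y), x - y \rangle$.


Proof a $\Rightarrow$ b: Denote $g(t) = f(t(y - x) + x)$, then $f(x) = g(0)$,
$f(y) = g(1)$, and $g'(t) = \langle \nabla f(t(y - x) + x), y - x \rangle$.
Therefore,

$$
\begin{aligned}
f(y) - f(x) - \langle \nabla f(x), y - x \rangle
&= g(1) - g(0) - \langle \nabla f(x), y - x \rangle \\
&= \int_0^1 g'(t)\,dt - \langle \nabla f(x), y - x \rangle \\
&= \int_0^1 \langle \nabla f(t(y - x) + x) - \nabla f(x), y - x \rangle\,dt \\
&\le \int_0^1 \|\nabla f(t(y - x) + x) - \nabla f(x)\| \cdot \|y - x\|\,dt \\
&\le \int_0^1 L t \|y - x\| \cdot \|y - x\|\,dt = \frac{L}{2}\|y - x\|^2.
\end{aligned}
$$

b $\Rightarrow$ c: Denote $f_x(z) = f(z) - \langle \nabla f(x), z \rangle$. For
$\forall z, z' \in \mathbb{R}^d$,

$$
\begin{aligned}
f(z') - f(z) &\le \langle \nabla f(z), z' - z \rangle + \frac{L}{2}\|z' - z\|^2, \\
f(z') - f(z) - \langle \nabla f(x), z' - z \rangle
&\le \langle \nabla f(z) - \nabla f(x), z' - z \rangle + \frac{L}{2}\|z' - z\|^2, \\
f_x(z') - f_x(z) &\le \langle \nabla f_x(z), z' - z \rangle + \frac{L}{2}\|z' - z\|^2.
\end{aligned} \tag{12}
$$

By using the convexity of $f$, we have

$$
\begin{aligned}
f_x(z') - f_x(z) &= f(z') - f(z) - \langle \nabla f(x), z' - z \rangle \\
&\ge \langle \nabla f(z), z' - z \rangle - \langle \nabla f(x), z' - z \rangle
= \langle \nabla f_x(z), z' - z \rangle.
\end{aligned}
$$

Therefore $f_x(z)$ is also convex. Since $\nabla f_x(z) = \nabla f(z) - \nabla f(x)$,
$f_x(z)$ achieves its minimum at $z = x$. By (12),

$$
\begin{aligned}
f_x(x) = \min_{z'} f_x(z')
&\le \min_{z'} \left[ f_x(z) + \langle \nabla f_x(z), z' - z \rangle + \frac{L}{2}\|z' - z\|^2 \right] \\
&= f_x(z) + \min_{\|y\| = 1} \min_{t \ge 0} \left[ t \langle \nabla f_x(z), y \rangle + \frac{L}{2} t^2 \right] \\
&= f_x(z) + \min_{\|y\| = 1} \left[ -\frac{\langle \nabla f_x(z), y \rangle^2}{2L} \right] \\
&= f_x(z) - \frac{1}{2L}\|\nabla f_x(z)\|^2.
\end{aligned}
$$

Therefore $f_x(z) - f_x(x) \ge \frac{1}{2L}\|\nabla f_x(z)\|^2$, which implies

$$
\begin{aligned}
f(y) - f(x) - \langle \nabla f(x), y - x \rangle &= f_x(y) - f_x(x) \\
&\ge \frac{1}{2L}\|\nabla f_x(y)\|^2 = \frac{1}{2L}\|\nabla f(y) - \nabla f(x)\|^2.
\end{aligned}
$$

c $\Rightarrow$ d: Swapping $x$ and $y$ in c, we have

$$
f(x) \ge f(y) + \langle \nabla f(y), x - y \rangle + \frac{1}{2L}\|\nabla f(x) - \nabla f(y)\|^2.
$$

Summing the two inequalities, we have

$$
\frac{1}{L}\|\nabla f(x) - \nabla f(y)\|^2 \le \langle \nabla f(x) - \nabla f(y), x - y \rangle.
$$

d $\Rightarrow$ a: $\|\nabla f(x) - \nabla f(y)\|^2 \le L \langle \nabla f(x) - \nabla f(y), x - y \rangle \le L \|\nabla f(x) - \nabla f(y)\| \cdot \|x - y\|$
by the Cauchy-Schwarz inequality, thus
$\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$.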


Theorem 2 Let $f(x) = \frac{1}{2}x^T Q x + q^T x + c$, where $Q$ is positive
definite. Then $f(x)$ is $L$-smooth and $\sigma$-strongly convex, where
$L = \lambda_M$ and $\sigma = \lambda_m$; $\lambda_M$ and $\lambda_m$ are the
maximum and minimum eigenvalues of $Q$, respectively.


Proof Since $\nabla f(x) = Qx + q$, we have

$$
\|\nabla f(x) - \nabla f(y)\| = \|Q(x - y)\| \le \|Q\| \cdot \|x - y\| = \lambda_M \|x - y\|.
$$

Hence $f(x)$ is $\lambda_M$-smooth. It is easy to verify that

$$
f(x) - f(y) - \langle \nabla f(y), x - y \rangle = \frac{1}{2}(x - y)^T Q (x - y).
$$

Since $Q$ is symmetric positive definite, it can be orthogonally diagonalized as
$Q = P^T \Lambda P$, where $P$ is an orthogonal matrix, $\Lambda$ is the diagonal
matrix of eigenvalues, and $\lambda_m > 0$. Thus, with $z = P(x - y)$,

$$
\frac{1}{2}(x - y)^T Q (x - y) = \frac{1}{2} z^T \Lambda z
\ge \frac{1}{2}\lambda_m \|z\|^2 = \frac{1}{2}\lambda_m \|x - y\|^2.
$$

Thus $f(x)$ is $\lambda_m$-strongly convex.
For system (2) and objective function (6), we have:

Proposition 1 Each $f_i$ is convex and $L$-smooth.


Proof Consider $f(x) = \frac{1}{2}(q^T x + c)^2$, where
$q = [q_1, q_2, \ldots, q_n]^T \in \mathbb{R}^n$, $x \in \mathbb{R}^n$,
$c \in \mathbb{R}$. Obviously, $f$ is convex, and
$\nabla f(x) = q q^T x + c q$,
$\|\nabla f(x) - \nabla f(y)\| = \|q q^T (x - y)\| \le \|q q^T\| \cdot \|x - y\| \le \|q\|^2 \|x - y\|$.
Therefore, for each $f_i$ in (9), let $L = \max_i \left(\|h_i\|^2\right) > 0$,
$h_i = [h_{i1}, \ldots, h_{ii}, 0, \ldots, 0]^T$; then $f_i$ is $L$-smooth.


Proposition 2 $F$ is $L$-smooth and $\sigma$-strongly convex.

Proof By (6), we have

$$
\begin{aligned}
F(u_k) &= \frac{1}{2N}(y_d - H u_k)^T (y_d - H u_k) \\
&= \frac{1}{2N}\left(u_k^T H^T H u_k - y_d^T H u_k - u_k^T H^T y_d + y_d^T y_d\right),
\end{aligned}
$$

where $H^T H$ is positive definite. Then by Theorem 2, $F(u_k)$ is $L$-smooth and
$\sigma$-strongly convex, where $L$ and $\sigma$ are $\frac{1}{N}$ times the
maximal and minimal eigenvalues of $H^T H$, respectively.


2.3.2 Proof of Convergence 


In Algorithm 1, set the “snapshot” updating as Option I. Then, under the assumptions 
of system (2), the convergence of Algorithm 1 is given by the following theorem. 
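Theorem 3 Suppose that each $f_i$ is convex and $L$-smooth, and that $F$ is
$\sigma$-strongly convex. Denote the optimal point $u^* = \arg\min_u F(u)$, and
assume that $m$ is large enough such that

$$
\alpha = \frac{1}{\sigma\eta(1 - 2L\eta)m} + \frac{2L\eta}{1 - 2L\eta} < 1.
$$

Then the convergence of Algorithm 1 satisfies

$$
\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right] \le \alpha^s \left(F(\tilde{u}^0) - F(u^*)\right).
$$


Proof Since $f_i$ is convex and $L$-smooth, for any $i$, by Theorem 1,

$$
\|\nabla f_i(u) - \nabla f_i(u^*)\|^2 \le 2L\left[f_i(u) - f_i(u^*) - \langle \nabla f_i(u^*), u - u^* \rangle\right]. \tag{13}
$$

Since $\frac{1}{N}\sum_i \nabla f_i(u) = \nabla F(u)$ and $\nabla F(u^*) = 0$, we
regard $\nabla f_i$ as random vectors which take values from
$\{\nabla f_i\}_{i=1}^{N}$; then

$$
\mathbb{E}\left[\|\nabla f_i(u) - \nabla f_i(u^*)\|^2\right]
= \frac{1}{N}\sum_{i=1}^{N}\|\nabla f_i(u) - \nabla f_i(u^*)\|^2
\le 2L\left[F(u) - F(u^*)\right]. \tag{14}
$$

For any fixed $s$, we set
$w_k = \nabla f_i(u_{s,k}) - \nabla f_i(\tilde{u}^s) + \nabla F(\tilde{u}^s)$, then

$$
\begin{aligned}
\mathbb{E}\left[\|w_k\|^2\right]
&\le 2\mathbb{E}\left[\|\nabla f_i(u_{s,k}) - \nabla f_i(u^*)\|^2\right]
+ 2\mathbb{E}\left[\|\nabla f_i(\tilde{u}^s) - \nabla f_i(u^*) - \nabla F(\tilde{u}^s)\|^2\right] \\
&= 2\mathbb{E}\left[\|\nabla f_i(u_{s,k}) - \nabla f_i(u^*)\|^2\right]
+ 2\mathbb{E}\left[\left\|\nabla f_i(\tilde{u}^s) - \nabla f_i(u^*) - \mathbb{E}\left[\nabla f_i(\tilde{u}^s) - \nabla f_i(u^*)\right]\right\|^2\right] \\
&\le 2\mathbb{E}\left[\|\nabla f_i(u_{s,k}) - \nabla f_i(u^*)\|^2\right]
+ 2\mathbb{E}\left[\|\nabla f_i(\tilde{u}^s) - \nabla f_i(u^*)\|^2\right] \\
&\le 4L\left[F(u_{s,k}) - F(u^*) + F(\tilde{u}^s) - F(u^*)\right].
\end{aligned} \tag{15}
$$

In the above, we have used the inequality $\|a + b\|^2 \le 2\|a\|^2 + 2\|b\|^2$
and the property
$\mathbb{E}\|\zeta - \mathbb{E}\zeta\|^2 = \mathbb{E}\|\zeta\|^2 - \|\mathbb{E}\zeta\|^2 \le \mathbb{E}\|\zeta\|^2$,
as well as (14). Notice that $\mathbb{E}[w_k] = \nabla F(u_{s,k})$, thus

$$
\begin{aligned}
\mathbb{E}\|u_{s,k+1} - u^*\|^2
&= \|u_{s,k} - u^*\|^2 - 2\eta\,\mathbb{E}\left[\langle w_k, u_{s,k} - u^* \rangle\right] + \eta^2\,\mathbb{E}\left[\|w_k\|^2\right] \\
&\le \|u_{s,k} - u^*\|^2 - 2\eta \langle \nabla F(u_{s,k}), u_{s,k} - u^* \rangle
+ 4L\eta^2\left[F(u_{s,k}) - F(u^*) + F(\tilde{u}^s) - F(u^*)\right] \\
&\le \|u_{s,k} - u^*\|^2 - 2\eta\left[F(u_{s,k}) - F(u^*)\right]
+ 4L\eta^2\left[F(u_{s,k}) - F(u^*) + F(\tilde{u}^s) - F(u^*)\right] \\
&= \|u_{s,k} - u^*\|^2 - 2\eta(1 - 2L\eta)\left[F(u_{s,k}) - F(u^*)\right]
+ 4L\eta^2\left[F(\tilde{u}^s) - F(u^*)\right].
\end{aligned}
$$

In the above, we have used (15) and the convexity of $F$, i.e.,
$-\langle \nabla F(u_{s,k}), u_{s,k} - u^* \rangle \le F(u^*) - F(u_{s,k})$.

We sum up the expectations of the above inequality for $k = 0, 1, \ldots, m - 1$.
Using the convexity of $F$ and the selection of $\tilde{u}^{s+1}$ under Option I,
we have
$F(\tilde{u}^{s+1}) = F\left(\frac{1}{m}\sum_{k=0}^{m-1} u_{s,k}\right) \le \frac{1}{m}\sum_{k=0}^{m-1} F(u_{s,k})$.
Therefore,

$$
\begin{aligned}
&\mathbb{E}\|u_{s,m} - u^*\|^2 + 2\eta(1 - 2L\eta)m\,\mathbb{E}\left[F(\tilde{u}^{s+1}) - F(u^*)\right] \\
&\quad\le \mathbb{E}\|u_{s,0} - u^*\|^2 + 4L\eta^2 m\,\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right] \\
&\quad= \mathbb{E}\|\tilde{u}^s - u^*\|^2 + 4L\eta^2 m\,\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right] \\
&\quad\le \frac{2}{\sigma}\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right] + 4L\eta^2 m\,\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right],
\end{aligned}
$$

where the last step uses the $\sigma$-strong convexity of $F$. Thus, we obtain

$$
\mathbb{E}\left[F(\tilde{u}^{s+1}) - F(u^*)\right]
\le \left[\frac{1}{\sigma\eta(1 - 2L\eta)m} + \frac{2L\eta}{1 - 2L\eta}\right]
\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right]
= \alpha\,\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right].
$$

Applying the above inequality recursively for $s = 0, 1, \ldots, S - 1$, we have

$$
\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right] \le \alpha^s\,\mathbb{E}\left[F(\tilde{u}^0) - F(u^*)\right].
$$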


Remark 3 Theorem 3 indicates that Algorithm 1 has the rate of convergence
$O(\alpha^s)$, and this convergence rate is determined by the value of $\alpha$.
If the information of the system $H$ is known, to make $\alpha$ as small as
possible, we generally take $\eta = O(1/L)$ and $m = O(\kappa)$, so that the value
of $\alpha$ is close to $1/2$. If the system information is unknown, we need to
find appropriate $\eta$ and $m$ by experiment.
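
As a worked illustration of the contraction factor in Theorem 3, the sketch below evaluates $\alpha$ for a hypothetical lifted matrix. The concrete choices $\eta = 0.1/L$ and $m = \lceil 50\kappa \rceil$ are assumptions made for this example (they are a common parameterization in the SVRG literature, not values stated in this chapter); with them, the first term of $\alpha$ is at most $1/4$ and the second equals $1/4$.

```python
import math
import numpy as np

def svrg_rate(L, sigma, eta, m):
    """Contraction factor alpha from Theorem 3; requires 2 * L * eta < 1."""
    return 1.0 / (sigma * eta * (1 - 2 * L * eta) * m) + 2 * L * eta / (1 - 2 * L * eta)

# Hypothetical lifted matrix with entries 0.5**(i - j) on and below the diagonal.
N = 20
idx = np.arange(N)
H = np.tril(0.5 ** np.subtract.outer(idx, idx))

L = np.max(np.sum(H**2, axis=1))                  # Proposition 1: max_i ||h_i||^2
sigma = np.linalg.eigvalsh(H.T @ H).min() / N     # Proposition 2: lambda_min(H^T H) / N
kappa = L / sigma                                 # condition number (Definition 3)

eta = 0.1 / L                                     # assumed step size
m = math.ceil(50 * kappa)                         # assumed inner-loop length
alpha = svrg_rate(L, sigma, eta, m)
print(kappa, m, alpha)
```

Because $\sigma$ scales like $1/N$ for the finite-sum objective (9), the inner-loop length required by the theorem can be much larger than the empirical choice $m = 2N$ of Algorithm 1, which is why $\eta$ and $m$ are tuned experimentally in practice.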


3 SVRG-Based ILC Under Random Data Dropouts


This section follows the SISO system in Sect. 2, but assumes that random data
dropouts occur in the error signal transmission. This section consists of four
parts: system description, algorithm design, performance analysis, and numerical
simulation.


3.1 System Description 


In SISO system (2), Assumptions 1-3 still hold, but we assume that data dropouts
occur in the transmission of the error signal, as shown in Fig. 2. We further
assume that the dropouts follow the Bernoulli distribution model (Shen, 2018).
Therefore, the ILC updating law (7) becomes

$$
u_{k+1} = u_k + \frac{\eta_k}{N} H^T \Gamma_k e_k, \tag{16}
$$

where $\Gamma_k = \mathrm{diag}\{\gamma_k(1), \gamma_k(2), \ldots, \gamma_k(N)\}$,
and the $\gamma_k(i)$ are i.i.d. random variables following the Bernoulli
distribution. Let $\gamma \triangleq \mathbb{E}[\gamma_k(i)]$ be the successful
transmission rate, where $\gamma_k(i) = 0$ means that a data dropout occurs at the
$i$-th time instant of the $k$-th batch, and $\gamma_k(i) = 1$ means that no data
dropout occurs.


Fig. 2 ILC with random data dropouts


3.2 Algorithm Design


Based on the gradient descent method, there are two main approaches to solving the
data dropout problem:


(1) Obtain the full gradient by retransmission. For each transmission, the
controller stores the successfully transmitted data and asks for the lost data to
be retransmitted until all data are received. This method eliminates the effect of
data dropouts by retransmission.

(2) Use the successfully transmitted data to construct a random gradient. The
input is updated directly in each iteration using the successfully transmitted
data, as shown in the update law (16).


The first method wastes a lot of time when data retransmission is slow. Although
the second method saves the time of data retransmission, its actual running time
may be longer than that of the first method when data retransmission is fast.

Based on the framework of Algorithm 1, the SVRG-based ILC under error data 
dropouts can be constructed by utilizing the second method for each iteration, but 
calculating the full gradient every several iterations using the first method. This 
algorithm does not require data retransmission in most cases compared to the first 
method and has a significant improvement in convergence speed compared to the 
second method. Thus it can achieve a good balance between convergence rate and 
data retransmission speed, and it is more suitable for general data dropout cases. 

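The formal construction of the algorithm is given below.

Firstly, we take the random gradient
$\tilde{\nabla}_k \triangleq -\frac{1}{\gamma N} H^T \Gamma_k e_k$, and we note
that $\mathbb{E}[\tilde{\nabla}_k] = -\frac{1}{N} H^T e_k = \nabla_k$. For the
convenience of the proof, we write $\tilde{\nabla}_k$ as

$$
\tilde{\nabla} F_k(u) = \frac{1}{\gamma N} \sum_{\{i \,\mid\, \gamma_k(i) = 1\}} \nabla f_i(u), \tag{17}
$$

where $\nabla f_i$ is defined in the same way as in (10). Notice that if there is
no dropout, $\gamma = 1$, and therefore (17) is equivalent to (10).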
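
The unbiasedness of the dropout gradient can be verified exactly for a small horizon by enumerating all $2^N$ dropout patterns and weighting each by its Bernoulli probability. The sketch below assumes NumPy, a hypothetical lifted matrix, and the $\frac{1}{\gamma N}$ scaling of the random gradient; the numerical values of $N$ and $\gamma$ are illustrative only.

```python
import itertools
import numpy as np

N, gamma = 4, 0.7                                  # small horizon, success rate (assumed)
idx = np.arange(N)
H = np.tril(0.5 ** np.subtract.outer(idx, idx))    # hypothetical lifted matrix
rng = np.random.default_rng(1)
y_d = rng.standard_normal(N)
u = rng.standard_normal(N)

e = y_d - H @ u
full_grad = -(H.T @ e) / N                         # full gradient, eq. (8)

expected_grad = np.zeros(N)
for pattern in itertools.product([0, 1], repeat=N):
    g = np.array(pattern, dtype=float)             # diagonal of Gamma_k
    prob = np.prod(np.where(g == 1, gamma, 1 - gamma))
    dropout_grad = -(H.T @ (g * e)) / (gamma * N)  # gradient built from received entries
    expected_grad += prob * dropout_grad

print(np.allclose(expected_grad, full_grad))
```

The check relies only on $\mathbb{E}[\gamma_k(i)] = \gamma$, so it would hold equally for any independent dropout pattern with the same success rate.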


Remark 4 Equation (17) is similar to the Batch Gradient Descent (BGD) method
in Machine Learning, but they are fundamentally different. In (17), the number of
$\nabla f_i$ in each summation $\sum_i \nabla f_i$ varies according to the value
of the random vector $\{\gamma_k(i)\}_{i=1}^{N}$, whereas in BGD the number of
$\nabla f_i$ is fixed. Therefore, the algorithm based on the gradient
$\tilde{\nabla} F_k$ cannot be directly applied to BGD.


Secondly, similar to (11), the ILC updating law under random data dropouts is
constructed:

$$
u_{s,k+1} = u_{s,k} - \eta\left(\tilde{\nabla} F_k(u_{s,k}) - \tilde{\nabla} F_k(\tilde{u}^s) + \nabla F(\tilde{u}^s)\right). \tag{18}
$$

Finally, change the iteration length $m$ in Algorithm 1 from $2N$ to
$\lceil 2\gamma N \rceil$. Because the number of terms in the summation in each
batch is $\mathbb{E}\left[\sum_{i=1}^{N}\gamma_k(i)\right] = \gamma N$ in the
sense of expectation, $\tilde{\nabla} F_k$ is equivalent to a sum of $\gamma N$
terms $\nabla f_i$.
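In conclusion, SVRG-based ILC under random data dropouts is shown in Algorithm 2.


Algorithm 2 Data dropout SISO SVRG-based ILC
Input: $\eta$, $u_{0,0}$;
$m \leftarrow \lceil 2\gamma N \rceil$; $\tilde{u}^0 \leftarrow u_{0,0}$;
for $s \leftarrow 0$ to $S - 1$ do
    $u_{s,0} \leftarrow \tilde{u}^s$, $\mu_s \leftarrow \nabla F(\tilde{u}^s)$;
    for $k \leftarrow 0$ to $m - 1$ do
        $w_k \leftarrow \tilde{\nabla} F_k(u_{s,k}) - \tilde{\nabla} F_k(\tilde{u}^s) + \mu_s$;
        $u_{s,k+1} \leftarrow u_{s,k} - \eta w_k$;
    end for
    $\tilde{u}^{s+1} \leftarrow \frac{1}{m}\sum_{k=0}^{m-1} u_{s,k}$;
end for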


For the "snapshot" of Algorithm 2, the update method is taken as Option I in
Algorithm 1, and the recommended iteration length is set to
$\lceil 2\gamma N \rceil$. When $\gamma$ is unknown, we need to find an
appropriate $m$ by experiment.


3.3 Convergence Analysis 


By Proposition 1, every $f_i$ is convex and $L$-smooth, and the following
proposition holds:


Proposition 3 Each value of $\tilde{\nabla} F_k(u_k)$ is convex and $L'$-smooth,
where $L' = L/\gamma$, and $L$ is the Lipschitz constant corresponding to the
smoothness of $f_i$ in Proposition 1.

Proof Since every $f_i$ is convex and $L$-smooth,
$\frac{1}{\gamma}\nabla f_i(u_k)$ is $L'$-Lipschitz continuous, where
$L' = L/\gamma$. Each value of $\tilde{\nabla} F_k(u_k)$ is a linear combination
of the $\nabla f_i(u_k)$ with weights $\frac{1}{\gamma N}$, and the number of
terms in the summation does not exceed $N$. Therefore
$\tilde{\nabla} F_k(u_k)$ is $L'$-Lipschitz continuous; as a result,
$\tilde{\nabla} F_k(u_k)$ is convex and $L'$-smooth.


For the convergence of Algorithm 2, we have the following theorem. 


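Theorem 4 Suppose that each value of $\tilde{\nabla} F_k(u_k)$ is convex and
$L'$-smooth, and that $F$ is $L$-smooth and $\sigma$-strongly convex. For the
optimal point $u^* = \arg\min_u F(u)$, assume that $m$ is large enough s.t.

$$
\alpha = \frac{1}{\sigma\eta(1 - 2L'\eta)m} + \frac{2L'\eta}{1 - 2L'\eta} < 1.
$$

Then the convergence of Algorithm 2 satisfies

$$
\mathbb{E}\left[F(\tilde{u}^s) - F(u^*)\right] \le \alpha^s \left(F(\tilde{u}^0) - F(u^*)\right).
$$


Proof By Proposition 3, replacing $\nabla f_i$ in the proof of Theorem 3 with
$\tilde{\nabla} F_k$, equation (13) is rewritten as:

$$
\left\|\tilde{\nabla} F_k(u) - \tilde{\nabla} F_k(u^*)\right\|^2
\le \frac{2L'}{\gamma N} \sum_{\{i \,\mid\, \gamma_k(i) = 1\}}
\left[f_i(u) - f_i(u^*) - \langle \nabla f_i(u^*), u - u^* \rangle\right].
$$

The corresponding Eq. (14) is

$$
\begin{aligned}
\mathbb{E}\left[\left\|\tilde{\nabla} F_k(u) - \tilde{\nabla} F_k(u^*)\right\|^2\right]
&\le 2L'\,\mathbb{E}\left[\frac{1}{\gamma N} \sum_{\{i \,\mid\, \gamma_k(i) = 1\}}
\left[f_i(u) - f_i(u^*) - \langle \nabla f_i(u^*), u - u^* \rangle\right]\right] \\
&= 2L'\,\mathbb{E}\left[F(u) - F(u^*) - \langle \nabla F(u^*), u - u^* \rangle\right] \\
&= 2L'\,\mathbb{E}\left[F(u) - F(u^*)\right].
\end{aligned}
$$

The rest of the proof repeats the proof of Theorem 3.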


Remark 5 Theorem 4 shows that Algorithm 2 also converges linearly, and the speed
of convergence is determined by $\alpha$. For the choice of $m$, note that
Theorem 4 differs from Theorem 3 in the Lipschitz constant of the smoothness
condition. By Proposition 3, with $L' = L/\gamma$, for

$$
\alpha = \frac{1}{\sigma\eta(1 - 2L'\eta)m} + \frac{2L'\eta}{1 - 2L'\eta},
$$

we can consider multiplying $m$ by $\gamma$, i.e., changing $m$ from $2N$ to
$\lceil 2\gamma N \rceil$, to approximately keep the convergence of Algorithm 2.


3.4 Numerical Simulation 


In SISO system (1), take the system matrices $(A, B, C)$ as

$$
A = \begin{bmatrix} 0.50 & -0.25 & 1.00 \\ 0.15 & 0.30 & -0.50 \\ -0.75 & 0.25 & -0.25 \end{bmatrix},
\quad
B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix},
\quad
C = \begin{bmatrix} 0 & 0 & 1.0 \end{bmatrix}.
$$

Take the desired trajectory $y_d(t) = \sin(2\pi t/50)$, time length $N = 50$,
initial state $x_0 = 0$, and initial input $u_0 = 0$. For $\gamma = 0.9$ and
$\gamma = 0.6$, the ILC based on the full gradient (GD), stochastic gradient
(SGD), and stochastic variance reduced gradient (SVRG) is shown in Fig. 3.

When calculating the full gradient, each data retransmission adds 1 to the
iteration number, which means that the controller skips one round of computation
for every retransmission until all error information has been received. For each
of the three methods, the best step size under which the method converges is used.

When $\gamma = 0.9$, Fig. 3a shows that the SVRG-based ILC converges slightly
faster than the GD- and SGD-based ILC. When $\gamma = 0.6$, Fig. 3b illustrates a
significant difference in the convergence speed of the three types of ILC, from
fast to slow: SVRG-, SGD-, and GD-based ILC. In summary, the SVRG-based ILC under
error data dropouts, i.e., Algorithm 2, outperforms the GD- and SGD-based ILC
under different successful transmission rates, and the difference becomes more
significant as $\gamma$ decreases.


4 Model-Free SVRG-Based ILC for MIMO Systems 


This section extends the Algorithm 1 in Sect. 2 from SISO systems to MIMO systems. 
Firstly, a system description of the discrete linear MIMO system is given. Secondly, 
the existing model-free data-driven methods are introduced, and a new model-free 
data-driven ILC based on SVRG method is constructed. Thirdly, the convergence 
of the algorithm under non-strongly convex conditions is proved. Finally, numerical 
simulations are established to verify the convergence performance of SVRG-based 
ILC in deterministic and noisy systems. 


4.1 System Description 


Consider the following discrete linear multi-input multi-output (MIMO) system
$\mathcal{J}$, which has $q$ inputs $u_k^1, u_k^2, \ldots, u_k^q$ and $p$ outputs
$y_k^1, y_k^2, \ldots, y_k^p$. Rewrite the system in the form of (2),

$$
y_k = \mathcal{J} u_k, \tag{19}
$$

where

$$
\mathcal{J} = \begin{bmatrix}
J_{11} & J_{12} & \cdots & J_{1q} \\
J_{21} & J_{22} & \cdots & J_{2q} \\
\vdots & \vdots & \ddots & \vdots \\
J_{p1} & J_{p2} & \cdots & J_{pq}
\end{bmatrix},
\quad
u_k = \begin{bmatrix} u_k^1 \\ u_k^2 \\ \vdots \\ u_k^q \end{bmatrix},
\quad
y_k = \begin{bmatrix} y_k^1 \\ y_k^2 \\ \vdots \\ y_k^p \end{bmatrix}.
$$

Each $J_{ij} \in \mathbb{R}^{N \times N}$ has the same properties as the matrix
$H$ in (2). $y_k^i = [y_k^i(1), \ldots, y_k^i(N)]^T$,
$u_k^j = [u_k^j(0), \ldots, u_k^j(N-1)]^T$, $N$ is the length of time, and the
desired trajectory is
$y_d = \left[(y_d^1)^T, (y_d^2)^T, \ldots, (y_d^p)^T\right]^T$.


Fig. 3 Comparison of three gradient-based ILC under error data dropouts: error
$\|e_k\|$ versus iteration number $k$ for (a) $\gamma = 0.9$ and (b) $\gamma = 0.6$


For this system, consider the following assumptions:


Assumption 4 System matrix $\mathcal{J} \neq 0$.


Assumption 5 The dimension of the input signal does not exceed the dimension of
the output signal, i.e., $p \ge q$.

Remark 6 If $p = q = 1$, the system (19) degenerates to the SISO system (2).
Unlike Assumption 1, Assumption 4 can no longer guarantee that the system matrix
$\mathcal{J}$ is of full rank. Assumption 4 is the most fundamental, since if
$\mathcal{J} = 0$, no input signal can track $y_d$. Assumption 2 is also relaxed
for this system; the reasons will be given in the proof of convergence. In
addition, Assumption 5 is added for the MIMO system, because an input dimension
larger than the output dimension means that there is redundant information.


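For the desired trajectory $y_d$, consider a control objective similar to (6),
i.e., to find a sequence $\{u_k\}$, s.t.

$$
G(u_k) = \frac{1}{2p}\|e_k\|^2 = \frac{1}{2p}\|y_d - \mathcal{J} u_k\|^2,
\qquad \lim_{k \to \infty} G(u_k) = G(u^*), \tag{20}
$$

where $u^*$ is the input at which $G$ takes its minimum value, and the error
signal $e_k$ is

$$
e_k = y_d - y_k = \begin{bmatrix} y_d^1 - y_k^1 \\ \vdots \\ y_d^p - y_k^p \end{bmatrix}
= \begin{bmatrix} e_k^1 \\ \vdots \\ e_k^p \end{bmatrix}.
$$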

4.2 Algorithm Design 


The full gradient of $G$ in (20) is
$\nabla_k = \nabla G(u_k) = -\frac{1}{p}\mathcal{J}^T (y_d - \mathcal{J} u_k)$. We
need the information of $\mathcal{J}^T$ to calculate the full gradient. However,
in model-free learning, we want to obtain the gradient by conducting experiments
on the system $\mathcal{J}$ only. For this purpose, Oomen et al. (2014) give the
following method to estimate $\mathcal{J}^T$.


Lemma 1 For the SISO system $\mathcal{J} = J_{11}$, its transpose
$\mathcal{J}^T$ can be obtained by matrix multiplication

$$
(J_{11})^T = T_N J_{11} T_N,
$$

where $T_N$ is the $N$-order permutation matrix whose anti-diagonal is 1, i.e.,

$$
T_N = \begin{bmatrix} 0 & \cdots & 1 \\ \vdots & ⋰ & \vdots \\ 1 & \cdots & 0 \end{bmatrix}.
$$

Therefore the full gradient of the SISO system,
$-\frac{1}{p}(J_{11})^T e_k = -\frac{1}{p} T_N J_{11} T_N e_k$, can be obtained by
a single experiment.

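Lemma 2 For the MIMO system $\mathcal{J}$, the transpose $\mathcal{J}^T$ is

$$
\mathcal{J}^T = \begin{bmatrix}
(J_{11})^T & \cdots & (J_{p1})^T \\
\vdots & \ddots & \vdots \\
(J_{1q})^T & \cdots & (J_{pq})^T
\end{bmatrix}
= \underbrace{\begin{bmatrix} T_N & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & T_N \end{bmatrix}}_{\mathcal{T}^{qN}}
\underbrace{\begin{bmatrix} J_{11} & \cdots & J_{p1} \\ \vdots & \ddots & \vdots \\ J_{1q} & \cdots & J_{pq} \end{bmatrix}}_{\bar{\mathcal{J}}}
\underbrace{\begin{bmatrix} T_N & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & T_N \end{bmatrix}}_{\mathcal{T}^{pN}}.
$$


For non-symmetric MIMO systems, $\bar{\mathcal{J}} \neq \mathcal{J}$, so the full
gradient $-\frac{1}{p}\mathcal{J}^T e_k$ of the MIMO system cannot be obtained
from a single experiment on the system $\mathcal{J}$. The method proposed by
Oomen et al. (2014) estimates $\mathcal{J}^T$ from $pq$ experiments:

$$
\mathcal{J}^T = \sum_{i=1}^{q} \sum_{j=1}^{p} \mathcal{L}^{ij}\, \mathcal{T}^{pN} \mathcal{J}\, \mathcal{T}^{qN} \mathcal{L}^{ij}, \tag{21}
$$

where $\mathcal{L}^{ij}$ is a matrix consisting of $q \times p$ blocks. In
$\mathcal{L}^{ij}$, the $(i, j)$ block is the unit matrix of order $N$, and the
remaining blocks are all 0:

$$
\mathcal{L}^{ij} = \begin{bmatrix}
0 & \cdots & 0 & \cdots & 0 \\
\vdots & & \vdots & & \vdots \\
0 & \cdots & I_N & \cdots & 0 \\
\vdots & & \vdots & & \vdots \\
0 & \cdots & 0 & \cdots & 0
\end{bmatrix} \in \mathbb{R}^{qN \times pN}.
$$

In (21), the left multiplication by $\mathcal{L}^{ij}$ takes out a single block
row ($j$-th) of $\mathcal{J}$, and the right multiplication by $\mathcal{L}^{ij}$
takes out a single block column ($i$-th) of $\mathcal{J}$. The two
multiplications lead to a great loss of system information. We would like to
improve the above method by extracting as much system information as possible.
Therefore, consider the following decomposition, analogous to (9).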

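Let
$g_i(u_k) = \frac{1}{2}\|e_k^i\|^2 = \frac{1}{2}\left\|y_d^i - \sum_{j=1}^{q} J_{ij} u_k^j\right\|^2$;
then (20) can be written as

$$
G(u_k) = \frac{1}{p}\sum_{i=1}^{p} g_i(u_k). \tag{22}
$$

Taking the gradient on both sides, we have

$$
\nabla G(u_k) = \frac{1}{p}\sum_{i=1}^{p} \nabla g_i(u_k), \tag{23}
$$

where

$$
\nabla g_i(u_k) = -\begin{bmatrix} J_{i1}^T e_k^i \\ J_{i2}^T e_k^i \\ \vdots \\ J_{iq}^T e_k^i \end{bmatrix} \in \mathbb{R}^{qN}.
$$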


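Note that $\nabla g_i(u_k)$ can be calculated from one block row of the system
matrix. The following lemma can help us design a controller to take out this
information.


Lemma 3 Calculating $\nabla g_i(u_k)$ only needs a single experiment.


Proof First, we note that

$$
\begin{bmatrix} J_{i1}^T & J_{i2}^T & \cdots & J_{iq}^T \end{bmatrix}
= \underbrace{\begin{bmatrix} 0 & \cdots & T_N & \cdots & 0 \end{bmatrix}}_{\mathcal{T}_i^{pN}}
\begin{bmatrix}
J_{11} & J_{12} & \cdots & J_{1q} \\
J_{21} & J_{22} & \cdots & J_{2q} \\
\vdots & \vdots & \ddots & \vdots \\
J_{p1} & J_{p2} & \cdots & J_{pq}
\end{bmatrix}
\underbrace{\begin{bmatrix} T_N & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & T_N \end{bmatrix}}_{\mathcal{T}^{qN}},
$$

where $\mathcal{T}_i^{pN} \in \mathbb{R}^{N \times pN}$ is the matrix whose $i$-th
block is $T_N$ and whose remaining blocks are 0. For $e_k^i \in \mathbb{R}^N$,

$$
e_k^i = \underbrace{[0, \ldots, 0, I_N, 0, \ldots, 0]}_{\mathcal{I}_i}\, e_k,
$$

where $\mathcal{I}_i \in \mathbb{R}^{N \times pN}$ is the matrix whose $i$-th
block is an identity matrix of order $N$ and whose remaining blocks are 0.

The matrix multiplication method can retrieve one block row of information of the
system, but it cannot directly produce a matrix suitable for further computation.
Therefore, simple and easy-to-implement linear mappings are considered to change
the matrix to a suitable dimension.

Set $E_k^i \in \mathbb{R}^{qN \times q}$ and define a linear mapping $\Psi$:

$$
\Psi(e_k^i) = E_k^i = \begin{bmatrix} e_k^i & & 0 \\ & \ddots & \\ 0 & & e_k^i \end{bmatrix},
$$

where $E_k^i$ is the matrix partitioned into $N \times 1$ blocks, with $e_k^i$ on
the diagonal and 0 in the rest of the blocks.
Since

$$
\mathcal{T}_i^{pN} \mathcal{J}\, \mathcal{T}^{qN} E_k^i
= \begin{bmatrix} J_{i1}^T e_k^i & J_{i2}^T e_k^i & \cdots & J_{iq}^T e_k^i \end{bmatrix} \in \mathbb{R}^{N \times q},
$$

we can define the linear map $\Phi: \mathbb{R}^{N \times q} \to \mathbb{R}^{qN}$.
It maps a matrix in $\mathbb{R}^{N \times q}$ to a vector in $\mathbb{R}^{qN}$ by
stacking the columns of the matrix in order, i.e.,

$$
\Phi\left(\begin{bmatrix} J_{i1}^T e_k^i & J_{i2}^T e_k^i & \cdots & J_{iq}^T e_k^i \end{bmatrix}\right)
= \begin{bmatrix} J_{i1}^T e_k^i \\ J_{i2}^T e_k^i \\ \vdots \\ J_{iq}^T e_k^i \end{bmatrix}.
$$

Combining the above, we have

$$
\nabla g_i(u_k) = -\Phi\left(\mathcal{T}_i^{pN} \mathcal{J}\, \mathcal{T}^{qN}\, \Psi(\mathcal{I}_i e_k)\right).
$$

Thus, $\nabla g_i(u_k)$ can be calculated in a single experiment.


Based on Lemma 3, a controller can be designed as shown in Fig. 4.

Fig. 4 Controller for model-free MIMO systems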

This controller can reduce the calculation of the full gradient in (21) from $pq$ experiments to $p$ experiments. However, when the system is noisy, there is no guarantee that the partial gradients $\nabla g_i(u_k)$ estimated in the individual experiments all correspond to the same full gradient $\nabla G(u_k)$. In noisy systems, we can use the random gradient $v_k$, which takes values uniformly in $\{\nabla g_i(u_k)\}_{i=1}^{p}$, i.e., $P\big(v_k = \nabla g_i(u_k)\big) = \frac{1}{p}$. However, the SGD method converges slowly. Taking both the convergence speed and the effect of noise in the system into account, we design an SVRG-based ILC algorithm similar to Algorithm 1, as shown in Algorithm 3.


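Algorithm 3 Data-driven MIMO SVRG-based ILC
Input: $\eta$, $u_{0,0}$;
$m \leftarrow 2p$; $\tilde{u}^{0} \leftarrow u_{0,0}$;
for $s \leftarrow 0$ to $S - 1$ do
    $u_{s,0} \leftarrow \tilde{u}^{s}$; $\mu_s \leftarrow \nabla G(\tilde{u}^{s})$;
    for $k \leftarrow 0$ to $m - 1$ do
        $w_k \leftarrow \nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s}) + \mu_s$, where $i$ is drawn randomly from $\{1, 2, \ldots, p\}$;
        $u_{s,k+1} \leftarrow u_{s,k} - \eta w_k$;
    end for
    $\tilde{u}^{s+1} \leftarrow u_{s,m}$;
end for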


Algorithm 3 uses Option II in Algorithm 1 to update the "snapshot". With $m = 2p$, the inner loop requires $2p$ experiments per outer iteration. A further $p$ experiments are required to compute the full gradient, so a total of $3p$ system experiments are required for each outer iteration. Compared with the SGD-based ILC, Algorithm 3 requires $p$ more system experiments per $m$ inner iterations to compute the full gradient in order to accelerate the convergence.
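The structure of Algorithm 3 can be sketched in a few lines. This is only an illustration under stated assumptions: the partial gradients are computed from known block rows $J_i$ (in the model-free setting they would come from experiments on the plant), the system is noise-free, and the matrices, step size and function names are hypothetical.

```python
import random

def matvec(a, x):
    return [sum(r * v for r, v in zip(row, x)) for row in a]

def grad_gi(J_i, y_i, u):
    """Gradient of g_i(u) = 0.5 * ||y_i - J_i u||^2, i.e. -J_i^T (y_i - J_i u)."""
    e = [y - v for y, v in zip(y_i, matvec(J_i, u))]
    return [-sum(J_i[r][c] * e[r] for r in range(len(J_i))) for c in range(len(u))]

def svrg_ilc(blocks, ys, u0, eta, S, seed=0):
    """SVRG loop with m = 2p inner steps and the Option II snapshot update."""
    rng = random.Random(seed)
    p = len(blocks)
    m = 2 * p
    snapshot = list(u0)
    for _ in range(S):
        # p "experiments": partial gradients at the snapshot, kept in memory.
        snap_grads = [grad_gi(J_i, y_i, snapshot) for J_i, y_i in zip(blocks, ys)]
        mu = [sum(col) / p for col in zip(*snap_grads)]  # full gradient
        u = list(snapshot)
        for _ in range(m):
            i = rng.randrange(p)  # uniformly random output channel
            g_now = grad_gi(blocks[i], ys[i], u)
            w = [a - b + c for a, b, c in zip(g_now, snap_grads[i], mu)]
            u = [x - eta * wi for x, wi in zip(u, w)]
        snapshot = u  # Option II: last inner iterate becomes the snapshot
    return snapshot

# Hypothetical system with p = 2 output blocks, q = 1 input, N = 2.
blocks = [[[1.0, 0.0], [0.5, 1.0]], [[2.0, 0.0], [1.0, 2.0]]]
u_true = [1.0, -1.0]
ys = [matvec(J_i, u_true) for J_i in blocks]
u_hat = svrg_ilc(blocks, ys, [0.0, 0.0], eta=0.05, S=60)
```

Each outer iteration uses $p$ gradient evaluations for the snapshot and $m = 2p$ for the inner loop, matching the $3p$ experiments counted above. In this noise-free example `u_hat` approaches `u_true`.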


4.3 Convergence Analysis 
From the following two propositions, we will see that G is not necessarily strongly 
convex. 


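Proposition 4 $G$ and $g_i$ are convex and $L$-smooth.


Proof We note that

\[
\nabla g_i(x) = -\begin{bmatrix} J_{i1}^{\mathsf T} \\ \vdots \\ J_{iq}^{\mathsf T} \end{bmatrix} \Big( y_d^i - \sum_{j=1}^{q} J_{ij} x^j \Big) = -J_i^{\mathsf T} \big( y_d^i - J_i x \big),
\]

where $J_i = [J_{i1}\ J_{i2}\ \cdots\ J_{iq}]$ and $x \in \mathbb{R}^{qN}$. Hence $g_i$ is convex, and

\[
\|\nabla g_i(x) - \nabla g_i(y)\| = \big\| J_i^{\mathsf T} J_i (x - y) \big\| \le \big\| J_i^{\mathsf T} J_i \big\| \cdot \|x - y\|.
\]

Let $L = \max_i \sum_{j=1}^{q} \lambda_{ij}$, where $\lambda_{ij}$ is the maximum eigenvalue of $J_{ij}^{\mathsf T} J_{ij}$. Since $J_{ij}^{\mathsf T} J_{ij}$ is always positive semidefinite, $\lambda_{ij} = 0$ if and only if $J_{ij} = 0$. Hence by Assumption 4, $L > 0$, and $g_i$ is convex and $L$-smooth.
Since $G$ is a convex combination of the $g_i$, $G$ is convex and $L$-smooth.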


For example, when $p = q = 1$ and $J = J_{11} = \mathrm{diag}\{1, 0, \ldots, 0\}$, $G$ is not strongly convex. The following proposition gives a sufficient condition for $G$ to be strongly convex.


Proposition 5 If the system matrix $J$ is of full column rank, then $G$ is $\sigma$-strongly convex.


Proof By Assumption 5, $p \ge q$, and

\[
G(u_k) = \frac{1}{2p} \big( u_k^{\mathsf T} J^{\mathsf T} J u_k - y_d^{\mathsf T} J u_k - u_k^{\mathsf T} J^{\mathsf T} y_d + y_d^{\mathsf T} y_d \big).
\]

$J^{\mathsf T} J$ is positive definite if and only if $J$ has full column rank. Therefore, by Theorem 2.2, $G(u_k)$ is $\sigma$-strongly convex when $J$ has full column rank, and the strong convexity factor $\sigma$ is $\frac{1}{p}$ times the minimum eigenvalue of the matrix $J^{\mathsf T} J$.


When $G$ is strongly convex, we can prove that Algorithm 3 converges linearly, in the same way as for Algorithm 1 (Bottou et al., 2018). The convergence proof of Algorithm 3 under the strongly convex condition is omitted because of limited space. We only give the convergence proof for the non-strongly convex case.

First we have the following lemma (Reddi et al., 2016).


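Lemma 4 Assume that $c_k, c_{k+1}, \beta > 0$ satisfy

\[
c_k = c_{k+1}\big(1 + \eta\beta + 2\eta^2 L^2\big) + \eta^2 L^3.
\]

If $\eta$, $\beta$ and $c_{k+1}$ are chosen such that

\[
\Gamma_k = \Big( \eta - \frac{c_{k+1}\eta}{\beta} - \eta^2 L - 2 c_{k+1}\eta^2 \Big) > 0,
\]

then each iteration of Algorithm 3 has the upper bound

\[
\mathbb{E}\big[\|\nabla G(u_{s,k})\|^2\big] \le \frac{R_{s,k} - R_{s,k+1}}{\Gamma_k},
\]

where $R_{s,k} \triangleq \mathbb{E}\big[ G(u_{s,k}) + c_k \|u_{s,k} - \tilde{u}^{s}\|^2 \big]$.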


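Proof Since $g_i$ is $L$-smooth,

\[
g_i(u_{s,k+1}) \le g_i(u_{s,k}) + \big\langle \nabla g_i(u_{s,k}),\, u_{s,k+1} - u_{s,k} \big\rangle + \frac{L}{2}\|u_{s,k+1} - u_{s,k}\|^2.
\]

For fixed $s$, let $w_k = \nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s}) + \mu_s$; then $\mathbb{E}[w_k] = \nabla G(u_{s,k})$. Since $u_{s,k+1} - u_{s,k} = -\eta w_k$, we use the above inequality and take the expectation on both sides to obtain

\[
\mathbb{E}\big[G(u_{s,k+1})\big] \le \mathbb{E}\Big[ G(u_{s,k}) - \eta \|\nabla G(u_{s,k})\|^2 + \frac{L\eta^2}{2}\|w_k\|^2 \Big]. \tag{24}
\]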


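In addition, for $\|u_{s,k+1} - \tilde{u}^{s}\|^2$, we have

\[
\begin{aligned}
\mathbb{E}\big[\|u_{s,k+1} - \tilde{u}^{s}\|^2\big]
&= \mathbb{E}\big[ \|u_{s,k+1} - u_{s,k}\|^2 + \|u_{s,k} - \tilde{u}^{s}\|^2 + 2\langle u_{s,k+1} - u_{s,k},\, u_{s,k} - \tilde{u}^{s} \rangle \big] \\
&= \mathbb{E}\big[ \eta^2\|w_k\|^2 + \|u_{s,k} - \tilde{u}^{s}\|^2 \big] - 2\eta\, \mathbb{E}\big[ \langle \nabla G(u_{s,k}),\, u_{s,k} - \tilde{u}^{s} \rangle \big] \\
&\le \mathbb{E}\big[ \eta^2\|w_k\|^2 + \|u_{s,k} - \tilde{u}^{s}\|^2 \big] + 2\eta\, \mathbb{E}\Big[ \frac{1}{2\beta}\|\nabla G(u_{s,k})\|^2 + \frac{\beta}{2}\|u_{s,k} - \tilde{u}^{s}\|^2 \Big].
\end{aligned} \tag{25}
\]

In the above, we have used Young's inequality $\langle x, y \rangle \le \frac{1}{2\beta}\|x\|^2 + \frac{\beta}{2}\|y\|^2$.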
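For $\mathbb{E}\big[\|w_k\|^2\big]$ we have the following estimate:

\[
\begin{aligned}
\mathbb{E}\big[\|w_k\|^2\big]
&= \mathbb{E}\big[ \|\nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s}) + \nabla G(\tilde{u}^{s})\|^2 \big] \\
&= \mathbb{E}\big[ \|\nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s}) + \nabla G(\tilde{u}^{s}) - \nabla G(u_{s,k}) + \nabla G(u_{s,k})\|^2 \big] \\
&\le 2\,\mathbb{E}\big[ \|\nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s}) - (\nabla G(u_{s,k}) - \nabla G(\tilde{u}^{s}))\|^2 \big] + 2\,\mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big] \\
&\le 2\,\mathbb{E}\big[ \|\nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s})\|^2 \big] + 2\,\mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big] \\
&\le 2L^2\,\mathbb{E}\big[ \|u_{s,k} - \tilde{u}^{s}\|^2 \big] + 2\,\mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big],
\end{aligned} \tag{26}
\]

where the inequalities follow from $\|a + b\|^2 \le 2\|a\|^2 + 2\|b\|^2$, from $\mathbb{E}\big[\|\xi - \mathbb{E}\xi\|^2\big] = \mathbb{E}\|\xi\|^2 - \|\mathbb{E}\xi\|^2 \le \mathbb{E}\|\xi\|^2$, and from the smoothness of $g_i$, i.e., $\|\nabla g_i(u_{s,k}) - \nabla g_i(\tilde{u}^{s})\| \le L\|u_{s,k} - \tilde{u}^{s}\|$.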

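Denoting $R_{s,k} \triangleq \mathbb{E}\big[ G(u_{s,k}) + c_k \|u_{s,k} - \tilde{u}^{s}\|^2 \big]$, by (24) and (25) we have

\[
\begin{aligned}
R_{s,k+1}
&\le \mathbb{E}\Big[ G(u_{s,k}) - \eta\|\nabla G(u_{s,k})\|^2 + \frac{L\eta^2}{2}\|w_k\|^2 \Big]
+ c_{k+1}\,\mathbb{E}\big[ \eta^2\|w_k\|^2 + \|u_{s,k} - \tilde{u}^{s}\|^2 \big] \\
&\quad + 2 c_{k+1}\eta\, \mathbb{E}\Big[ \frac{1}{2\beta}\|\nabla G(u_{s,k})\|^2 + \frac{\beta}{2}\|u_{s,k} - \tilde{u}^{s}\|^2 \Big] \\
&\le \mathbb{E}\Big[ G(u_{s,k}) - \eta\Big(1 - \frac{c_{k+1}}{\beta}\Big)\|\nabla G(u_{s,k})\|^2 \Big]
+ \eta^2\Big( \frac{L}{2} + c_{k+1} \Big)\mathbb{E}\big[\|w_k\|^2\big] \\
&\quad + c_{k+1}(1 + \eta\beta)\,\mathbb{E}\big[ \|u_{s,k} - \tilde{u}^{s}\|^2 \big].
\end{aligned}
\]

From (26), we have

\[
\begin{aligned}
R_{s,k+1}
&\le \mathbb{E}\big[G(u_{s,k})\big] + \big( c_{k+1}(1 + \eta\beta + 2\eta^2 L^2) + \eta^2 L^3 \big)\,\mathbb{E}\big[ \|u_{s,k} - \tilde{u}^{s}\|^2 \big] \\
&\quad - \Big( \eta - \frac{c_{k+1}\eta}{\beta} - \eta^2 L - 2 c_{k+1}\eta^2 \Big)\mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big] \\
&= R_{s,k} - \Big( \eta - \frac{c_{k+1}\eta}{\beta} - \eta^2 L - 2 c_{k+1}\eta^2 \Big)\mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big].
\end{aligned}
\]

Let $\Gamma_k \triangleq \big( \eta - \frac{c_{k+1}\eta}{\beta} - \eta^2 L - 2 c_{k+1}\eta^2 \big)$ with $\Gamma_k > 0$; then

\[
\mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big] \le \frac{R_{s,k} - R_{s,k+1}}{\Gamma_k}.
\]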


Because of the complexity of the non-strongly convex problem, we do not consider convergence criteria such as $\mathbb{E}[G(u) - G(u^*)] \le \epsilon$, as in Theorems 3 and 4, but instead prove $\mathbb{E}\big[\|\nabla G(u)\|^2\big] \le \epsilon$ for Algorithm 3. Note that if $G$ is $\sigma$-strongly convex, it is easy to verify that

\[
G(u) - G(u^*) \le \frac{1}{2\sigma}\|\nabla G(u)\|^2.
\]

Thus $\mathbb{E}\big[\|\nabla G(u)\|^2\big] \le \epsilon$ implies $\mathbb{E}[G(u) - G(u^*)] \le \frac{\epsilon}{2\sigma}$. However, this relationship does not always hold in the non-strongly convex case (Ghadimi & Lan, 2013). The following theorem establishes the convergence $\mathbb{E}\big[\|\nabla G(u)\|^2\big] \le \epsilon$ of Algorithm 3.


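Theorem 5 Suppose each $g_i$ is convex and $L$-smooth, and $G$ is convex. For $0 \le k \le m - 1$, let $c_k, c_{k+1}, \beta > 0$ and $c_m = 0$ satisfy

\[
c_k = c_{k+1}\big(1 + \eta\beta + 2\eta^2 L^2\big) + \eta^2 L^3,
\]

and let $\eta$, $\beta$, $c_{k+1}$ be chosen such that

\[
\Gamma_k = \Big( \eta - \frac{c_{k+1}\eta}{\beta} - \eta^2 L - 2 c_{k+1}\eta^2 \Big) > 0.
\]

Let $\gamma = \min_k \Gamma_k$, and let $u_a$ be a uniformly distributed random vector with values in $\{u_{s,k} \mid 0 \le s \le S - 1,\ 0 \le k \le m - 1\}$. Denote $u^* = \arg\min_u G(u)$; then Algorithm 3 satisfies

\[
\mathbb{E}\big[ \|\nabla G(u_a)\|^2 \big] \le \frac{G(\tilde{u}^{0}) - G(u^*)}{S m \gamma}.
\]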


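Proof We take $k = 0, 1, \ldots, m - 1$ in Lemma 4 and sum up to obtain

\[
\sum_{k=0}^{m-1} \mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big] \le \frac{R_{s,0} - R_{s,m}}{\gamma} = \frac{\mathbb{E}\big[ G(\tilde{u}^{s}) - G(\tilde{u}^{s+1}) \big]}{\gamma}.
\]

Here, in the definition of $R_{s,k}$, we have used $u_{s,0} = \tilde{u}^{s}$, $u_{s,m} = \tilde{u}^{s+1}$ and $c_m = 0$. We take $s = 0, 1, \ldots, S - 1$ in the above and sum up to get

\[
\mathbb{E}\big[ \|\nabla G(u_a)\|^2 \big] = \frac{1}{Sm} \sum_{s=0}^{S-1} \sum_{k=0}^{m-1} \mathbb{E}\big[ \|\nabla G(u_{s,k})\|^2 \big]
\le \frac{\mathbb{E}\big[ G(\tilde{u}^{0}) - G(\tilde{u}^{S}) \big]}{S m \gamma}
\le \frac{G(\tilde{u}^{0}) - G(u^*)}{S m \gamma}.
\]

In the above, we use the definition of $u_a$ and $G(\tilde{u}^{S}) \ge G(u^*)$.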


Remark 7 Theorem 5 shows that the convergence rate of Algorithm 3 is $O(1/(Sm))$ under non-strongly convex conditions. The convergence rate of the corresponding SGD-based ILC under non-strongly convex conditions is $O(1/\sqrt{Sm})$ (Johnson & Zhang, 2013). Moreover, the theorem states that the convergence of Algorithm 3 is only related to the step size $\eta$ but not to the choice of the number of iterations $m$. Since the system information is unknown, $\eta$ needs to be estimated by experiment.


Remark 8 Theorem 5 indicates that $G(u_a)$ can approach the optimal value $G(u^*) = \frac{1}{2p}\|y_d - J u^*\|^2$, i.e., $u_a$ can converge to the optimal input $u^*$. Similarly, Assumption 2 can also be changed to $\lim_{k \to \infty} F(u_k) = F(u^*)$ without affecting the convergence of Algorithm 1 and Algorithm 2. If $F(u^*) = 0$, then $\lim_{k \to \infty} F(u_k) = 0$.


Remark 9 System (21) degenerates to a SISO system when both the input and the output are one-dimensional. In this case, the theorem indicates that when system (2) satisfies Assumption 4 (Assumption 1 need not be satisfied), Algorithm 1 and Algorithm 2 still converge when the "snapshot" is updated by Option II. However, the convergence rate becomes $O(1/(Sm))$ when the objective function is not strongly convex.


4.4 Numerical Simulation 


We take MIMO systems with $21 \times 21$ input-output dimensions randomly generated by the Matlab drss function (Aarnoudse & Oomen, 2021) and set the time length to $N = 42$. The desired trajectory $y_d$ is 0.025 in each dimension. Model-free ILC based on the full gradient (GD), the stochastic gradient (SGD), and the stochastic variance reduced gradient (SVRG) are compared in Fig. 5. For each algorithm we use the best step size found over multiple experiments.

As shown in Fig. 5a, the GD- and SVRG-based ILC converge similarly for deterministic systems, and both are faster than the SGD-based ILC algorithm.

Fig. 5 Comparison of three data-driven gradient-based ILC under MIMO systems: (a) deterministic MIMO system, (b) noisy MIMO system. Both panels plot the error $\|e_k\|$ against the iteration number $k$.


For randomly generated MIMO systems with input-output dimensions of $21 \times 21$ and $N = 42$, we add Gaussian white noise to the system.

From Fig. 5b, we can see that both the GD- and the SGD-based ILC converge worse than the SVRG-based ILC when the system is noisy, whereas the SVRG-based ILC still maintains excellent convergence.


5 Conclusions


This chapter explores ILC based on the SVRG method. First, Sect. 2 gives the basic framework of SVRG-based ILC and proves that the algorithm converges linearly under smoothness and strong convexity conditions. Second, Sect. 3 designs an SVRG-based ILC algorithm to handle random error data dropouts and proves that the algorithm converges linearly. Finally, Sect. 4 constructs a model-free SVRG-based ILC by improving the existing model-free algorithm for MIMO systems and proves that the convergence rate is $O(1/k)$ under smoothness and convexity conditions. Two numerical simulations in Sects. 3 and 4 verify that the SVRG-based ILC converges faster than the GD- and SGD-based ILC in the random error dropout and model-free settings, respectively.

It should be noted that the SVRG-based ILC framework given in this chapter is 
not only applicable to the random error dropouts and model-free problems but can 
also be utilized to solve other error or system information deficient problems by 
properly decomposing the control objectives. Future research includes comparing 
its advantages and disadvantages with the stochastic approximation (SA) method, 
extending the framework to other information deficient problems, and attempting to 
develop algorithms with faster convergence based on this framework. 


Acknowledgements This work was supported by the National Natural Science Foundation of 
China (62173333), Beijing Natural Science Foundation (Z210002), and Research Fund of Renmin 
University of China (2021030187). 


References 


Aarnoudse, L., & Oomen, T. (2020). Model-free learning for massive MIMO systems: Stochastic 
approximation Adjoint iterative learning control. IEEE Control Systems Letters, 5(6), 1946-1951. 

Aarnoudse, L., & Oomen, T. (2021). Conjugate gradient MIMO iterative learning control using
data-driven stochastic gradients. In 2021 60th IEEE Conference on Decision and Control (CDC)
(pp. 3749-3754).

Allen-Zhu, Z. (2018). Katyusha: The first direct acceleration of stochastic gradient methods. Journal 
of Machine Learning Research, 18, 1-51. 

Arimoto, S., Kawamura, S., & Miyazaki, F. (1984). Bettering operation of robots by learning. The 
Journal of Intelligent and Robotic Systems, 1(2), 123-140. 

Bottou, L., Curtis, F. E., & Nocedal, J. (2018). Optimization methods for large-scale machine 
learning. SIAM Review, 60(2), 223-311. 

Ghadimi, S., & Lan, G. (2013). Stochastic first- and zeroth-order methods for nonconvex stochastic
programming. SIAM Journal on Optimization, 23, 2341-2368.

Gu, P., Tian, S., & Chen, Y. (2019). Iterative learning control based on Nesterov accelerated gradient
method. IEEE Access, 7, 115836-115842.

Johnson, R., & Zhang, T. (2013). Accelerating stochastic gradient descent using predictive variance
reduction. Advances in Neural Information Processing Systems, 1, 315-323.


Nesterov, Y. (2005). Introductory lectures on convex programming volume: A basic course (Vol. I). 
Kluwer Academic Publishers. 

Oomen, T., van der Maas, R., Rojas, C. R., & Hjalmarsson, H. (2014). Iterative data-driven H-infinity
norm estimation of multivariable systems with application to robust active vibration isolation.
IEEE Transactions on Control Systems Technology, 22(6), 2247-2260.

Owens, D. H., Hatonen, J. J., & Daley, S. (2009). Robust monotone gradient-based discrete-time
iterative learning control. The International Journal of Robust and Nonlinear Control, 19(6),
634-661.

Reddi, S. J., Hefny, A., Sra, S., Poczos, B., & Smola, A. (2016). Stochastic variance reduction for
nonconvex optimization. In Proceedings of the 33rd International Conference on International
Conference on Machine Learning (Vol. 48, pp. 314-323).

Shen, D. (2018). Iterative learning control with incomplete information: A survey. IEEE/CAA Journal
of Automatica Sinica, 5(5), 885-901.

Yang, X., & Ruan, X. (2017). Reinforced gradient-type iterative learning control for discrete linear
time-invariant systems with parameters uncertainties and external noises. IMA Journal of
Mathematical Control and Information, 34(4), 1117-1133.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate 
credit to the original author(s) and the source, provide a link to the Creative Commons license and 
indicate if changes were made. 

The images or other third party material in this chapter are included in the chapter’s Creative 
Commons license, unless indicated otherwise in a credit line to the material. If material is not 
included in the chapter’s Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from 
the copyright holder. 


A Generalization of NTRUEncrypt




Zheng Zhiyong, Liu Fengxia, Huang Wenlin, Xu Jie, and Tian Kun 


Abstract The main purpose of this chapter is to give a more general construction of NTRU based on ideal matrices and $q$-ary lattice theory. To understand our construction, we first discuss a more general form of the ordinary cyclic code, namely the $\phi$-cyclic code, which first appeared in (Lopez-Permouth et al., 2009; Shi et al., 2020); we then give a more general NTRUEncrypt by replacing the finite field with the real number field $\mathbb{R}$.


Keywords $\phi$-cyclic code · Ideal matrices · Convolutional modular lattice ·
1 $\phi$-Cyclic Code


Let $\mathbb{F}_q$ be a finite field with $q$ elements, where $q$ is a power of a prime number, and let $\mathbb{F}_q[x]$ be the polynomial ring over $\mathbb{F}_q$ in the variable $x$. Let $\mathbb{F}_q^n$ be the $n$-dimensional linear space over $\mathbb{F}_q$, and let $a = (a_0, a_1, \ldots, a_{n-1}) \in \mathbb{F}_q^n$ be a fixed vector in $\mathbb{F}_q^n$ with $a_0 \neq 0$. The associated polynomial of $a$ is given by

\[
\phi(x) = \phi_a(x) = x^n - a_{n-1}x^{n-1} - \cdots - a_1 x - a_0 \in \mathbb{F}_q[x], \quad a_0 \neq 0. \tag{1}
\]


Z. Zhiyong - L. Fengxia - H. Wenlin - X. Jie - T. Kun (8) 

Engineering Research Center of Ministry of Education for Financial Computing and Digital 
Engineering, Renmin University of China, Beijing 100872, China 

e-mail: tkun19891208 @ruc.edu.cn 


Z. Zhiyong 
e-mail: zhengzy @ruc.edu.cn 


L. Fengxia 
e-mail: liu, fx @ruc.edu.cn 


H. Wenlin 
e-mail: wenlin @ruc.edu.cn 


X. Jie 

e-mail: xujie0665 @ruc.edu.cn 

© The Author(s) 2023
Z. Zheng (ed.), Proceedings of the Second International Forum on Financial Mathematics 


and Financial Technology, Financial Mathematics and Fintech, 
https://doi.org/10.1007/978-981-99-2366-3 6 




Let $\langle \phi(x) \rangle$ be the principal ideal generated by $\phi(x)$ in $\mathbb{F}_q[x]$. There is a one-to-one correspondence between $\mathbb{F}_q^n$ and the quotient ring $R = \mathbb{F}_q[x]/\langle \phi(x) \rangle$, given by

\[
c = (c_0, c_1, \ldots, c_{n-1}) \in \mathbb{F}_q^n \ \longleftrightarrow\ c(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} \in R. \tag{2}
\]

In fact, this correspondence is also an isomorphism of Abelian groups. One may extend this correspondence to subsets of $\mathbb{F}_q^n$ and $R$ by

\[
C \subset \mathbb{F}_q^n \ \longleftrightarrow\ C(x) = \{ c(x) \mid c \in C \} \subset R. \tag{3}
\]

If $C \subset \mathbb{F}_q^n$ is a linear subspace of $\mathbb{F}_q^n$ of dimension $k$, then $C$ is called a linear code in coding theory and written as $C = [n, k]$ as usual. Each vector $c = (c_0, c_1, \ldots, c_{n-1}) \in C$ is called a codeword of length $n$. Obviously, $C = [n, 0]$ and $C = [n, n]$ are two trivial codes. Another, almost trivial, code is the constant code given by

\[
C = \{ (b, b, \ldots, b) \mid b \in \mathbb{F}_q \}, \quad \text{and } C = [n, 1].
\]

According to the given polynomial $\phi(x) = \phi_a(x)$, we may define a linear transformation $\tau_\phi$ in $\mathbb{F}_q^n$:

\[
\tau_\phi(c) = \tau_\phi\big((c_0, c_1, \ldots, c_{n-1})\big) = (a_0 c_{n-1},\ c_0 + a_1 c_{n-1},\ c_1 + a_2 c_{n-1},\ \ldots,\ c_{n-2} + a_{n-1} c_{n-1}). \tag{4}
\]

It is easily seen that $\tau_\phi : \mathbb{F}_q^n \to \mathbb{F}_q^n$ is a linear transformation.


Definition 1 Let $C \subset \mathbb{F}_q^n$ be a linear code. It is called a $\phi$-cyclic code if

\[
\forall c \in C \ \Rightarrow\ \tau_\phi(c) \in C. \tag{5}
\]

In other words, a linear code $C$ is a $\phi$-cyclic code if and only if $C$ is closed under the linear transformation $\tau_\phi$. Clearly, if $a = (1, 0, \ldots, 0)$, and $\phi_a(x) = x^n - 1$, then the $\phi$-cyclic code is precisely the ordinary cyclic code (see Lopez-Permouth et al. (2009)).
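The map $\tau_\phi$ in (4) is easy to experiment with. The following minimal sketch assumes that $q$ is prime, so that arithmetic in $\mathbb{F}_q$ is integer arithmetic modulo $q$; the function name `tau_phi` is illustrative only.

```python
def tau_phi(c, a, q):
    """Apply tau_phi from (4) to c = (c_0, ..., c_{n-1}) over F_q, q prime.

    a = (a_0, ..., a_{n-1}) holds the coefficients of
    phi(x) = x^n - a_{n-1} x^{n-1} - ... - a_1 x - a_0.
    """
    last = c[-1]
    return [(a[0] * last) % q] + [(c[i - 1] + a[i] * last) % q for i in range(1, len(c))]

# With a = (1, 0, ..., 0), i.e. phi(x) = x^n - 1, tau_phi is the ordinary cyclic shift.
shifted = tau_phi([1, 2, 3, 4], [1, 0, 0, 0], 5)

# A general phi over F_5: phi(x) = x^3 - x - 2, i.e. a = (2, 1, 0).
general = tau_phi([1, 2, 3], [2, 1, 0], 5)
```

Viewed through the correspondence (2), `tau_phi` computes the coefficients of $x \cdot c(x)$ reduced modulo $\phi(x)$, which is why closure under $\tau_\phi$ is the natural analogue of closure under cyclic shifts.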


Remark 1 The $\phi$-cyclic code given here is in fact a polycyclic code, which first appeared in (Lopez-Permouth et al., 2009; Shi et al., 2020), but we are mainly concerned with its application to the McEliece and Niederreiter cryptosystems. We first show that there is a one-to-one correspondence between $\phi$-cyclic codes in $\mathbb{F}_q^n$ and ideals in $R = \mathbb{F}_q[x]/\langle \phi(x) \rangle$.


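Theorem 1 Let $C \subset \mathbb{F}_q^n$ be a subset; then $C$ is a $\phi$-cyclic code if and only if $C(x)$ is an ideal of $R$.
Proof We use column notation for vectors in $\mathbb{F}_q^n$; then the linear transformation $\tau_\phi$ may be written as

\[
\tau_\phi \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix} = \begin{pmatrix} a_0 c_{n-1} \\ c_0 + a_1 c_{n-1} \\ \vdots \\ c_{n-2} + a_{n-1} c_{n-1} \end{pmatrix}, \quad \forall c = \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix} \in \mathbb{F}_q^n.
\]

Let $T_\phi$ be the $n \times n$ square matrix over $\mathbb{F}_q$,

\[
T_\phi = \begin{pmatrix} 0 & \cdots & 0 & a_0 \\ & & & a_1 \\ & I_{n-1} & & \vdots \\ & & & a_{n-1} \end{pmatrix}, \tag{6}
\]

where $I_{n-1}$ is the $(n-1) \times (n-1)$ unit matrix. The matrix expression of $\tau_\phi$ is as follows:

\[
\tau_\phi \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix} = T_\phi \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix} = \begin{pmatrix} a_0 c_{n-1} \\ c_0 + a_1 c_{n-1} \\ \vdots \\ c_{n-2} + a_{n-1} c_{n-1} \end{pmatrix}. \tag{7}
\]

Suppose $C \subset \mathbb{F}_q^n$ and $C(x)$ is an ideal of $R$; it is clear that $C$ is a linear code of $\mathbb{F}_q^n$. To prove that $C$ is a $\phi$-cyclic code, we note that for any polynomial $c(x) \in C(x)$, $x c(x) \in C(x)$ if and only if $\tau_\phi(c) \in C$; namely, if $c(x) \in C(x)$, then

\[
x c(x) \in C(x) \ \Leftrightarrow\ \tau_\phi(c) \in C \ \Leftrightarrow\ T_\phi c \in C. \tag{8}
\]

Therefore, if $C(x)$ is an ideal of $R$, then we have immediately that $C$ is a $\phi$-cyclic code of $\mathbb{F}_q^n$.
Conversely, if $C \subset \mathbb{F}_q^n$ is a $\phi$-cyclic code, then for all $k \ge 1$, we have

\[
\forall c \in C \ \Rightarrow\ T_\phi^k c \in C, \quad k \ge 1.
\]

It follows that

\[
\forall c(x) \in C(x) \ \Rightarrow\ x^k c(x) \in C(x), \quad 0 \le k \le n - 1,
\]

which implies that $C(x)$ is an ideal of $R$. This is the proof of Theorem 1.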


By Theorem 1, to find a $\phi$-cyclic code, it is enough to find an ideal of $R$. There are two trivial ideals $C(x) = 0$ and $C(x) = R$; the corresponding $\phi$-cyclic codes are $C = [n, 0]$ and $C = \mathbb{F}_q^n$, respectively, which are called trivial $\phi$-cyclic codes. To find non-trivial $\phi$-cyclic codes, we make use of the homomorphism theorems, which is a standard technique in algebra. Let $\pi$ be the natural homomorphism from $\mathbb{F}_q[x]$ to its quotient ring $R = \mathbb{F}_q[x]/\langle \phi(x) \rangle$, with $\ker \pi = \langle \phi(x) \rangle$,

\[
\langle \phi(x) \rangle \subset N \subset \mathbb{F}_q[x] \xrightarrow{\ \pi\ } R = \mathbb{F}_q[x]/\langle \phi(x) \rangle, \tag{9}
\]

where $N$ is an ideal of $\mathbb{F}_q[x]$ containing $\ker \pi = \langle \phi(x) \rangle$. Since $\mathbb{F}_q[x]$ is a principal ideal domain, $N = \langle g(x) \rangle$ is a principal ideal generated by a monic polynomial $g(x) \in \mathbb{F}_q[x]$. It is easy to see that

\[
\langle \phi(x) \rangle \subset \langle g(x) \rangle \ \Leftrightarrow\ g(x) \mid \phi(x).
\]

It follows that all ideals $N$ satisfying (9) are given by

\[
\{ \langle g(x) \rangle \mid g(x) \in \mathbb{F}_q[x] \text{ is monic and } g(x) \mid \phi(x) \}.
\]

We write $\langle g(x) \rangle \bmod \phi(x)$ for the image of $\langle g(x) \rangle$ under $\pi$; it is easy to check that

\[
\langle g(x) \rangle \bmod \phi(x) = \{ h(x) g(x) \mid h(x) \in \mathbb{F}_q[x] \text{ and } \deg h(x) + \deg g(x) < n \}, \tag{10}
\]

more precisely, this is a set of representative elements of $\langle g(x) \rangle \bmod \phi(x)$. By the homomorphism theorem in ring theory, all ideals of $R$ are given by

\[
\{ \langle g(x) \rangle \bmod \phi(x) \mid g(x) \in \mathbb{F}_q[x] \text{ is monic and } g(x) \mid \phi(x) \}. \tag{11}
\]

Let $d$ be the number of monic divisors of $\phi(x)$ in $\mathbb{F}_q[x]$; it follows immediately that
Corollary 1 The number of $\phi$-cyclic codes in $\mathbb{F}_q^n$ is $d$.
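Corollary 1 can be sanity-checked by brute force in a tiny case. The sketch below assumes $q = 2$ and $n = 3$, enumerates every subset of $\mathbb{F}_2^3$ containing the zero vector, and counts those that are closed under addition and under $\tau_\phi$; the function names are illustrative. For $\phi(x) = x^3 - 1 = (x + 1)(x^2 + x + 1)$ over $\mathbb{F}_2$ there are four monic divisors, while $\phi(x) = x^3 - x - 1$ is irreducible over $\mathbb{F}_2$ and has only two.

```python
from itertools import product

def tau_phi_f2(c, a):
    """tau_phi from (4) over F_2, returned as a tuple."""
    last = c[-1]
    return tuple([(a[0] * last) % 2] + [(c[i - 1] + a[i] * last) % 2 for i in range(1, len(c))])

def count_phi_cyclic_codes_f2(a):
    """Count the subspaces of F_2^n that are closed under tau_phi (exhaustive search)."""
    n = len(a)
    zero = tuple([0] * n)
    nonzero = [v for v in product((0, 1), repeat=n) if any(v)]
    count = 0
    for mask in range(1 << len(nonzero)):
        code = {zero} | {v for k, v in enumerate(nonzero) if (mask >> k) & 1}
        # Over F_2, a set containing 0 is a subspace iff it is closed under addition.
        closed_add = all(tuple((x + y) % 2 for x, y in zip(u, v)) in code
                         for u in code for v in code)
        closed_tau = all(tau_phi_f2(c, a) in code for c in code)
        if closed_add and closed_tau:
            count += 1
    return count

codes_cyclic = count_phi_cyclic_codes_f2((1, 0, 0))       # phi = x^3 - 1
codes_irreducible = count_phi_cyclic_codes_f2((1, 1, 0))  # phi = x^3 - x - 1
```

The exhaustive search is only feasible for very small $n$ and $q$; it is meant as a check of the statement, not as a way to construct codes.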
To compare the $\phi$-cyclic code and the ordinary cyclic code, we consider a simple example.


Example 1 The constant code $C$ is always a cyclic code since $1 + x + \cdots + x^{n-1} \mid x^n - 1$, and its generated polynomial is just $1 + x + \cdots + x^{n-1}$. But the constant code $C$ in $\mathbb{F}_q^n$ is not always a $\phi$-cyclic code; it is a $\phi$-cyclic code if and only if $1 + x + \cdots + x^{n-1} \mid \phi(x)$. An equivalent condition for $1 + x + \cdots + x^{n-1} \mid \phi(x)$ is

\[
a_{n-1} = a_{n-2} = \cdots = a_1 = b, \quad \text{and } a_0 = 1 + b.
\]


Definition 2 Let $C$ be a $\phi$-cyclic code and $C(x) = \langle g(x) \rangle \bmod \phi(x)$. We call $g(x)$ the generated polynomial of $C$, where $g(x)$ is monic and $g(x) \mid \phi(x)$.


Lemma 1 Let $g(x) = g_0 + g_1 x + \cdots + g_{n-k-1}x^{n-k-1} + x^{n-k}$ be the generated polynomial of a $\phi$-cyclic code $C$, where $1 \le k \le n - 1$, and $g(x) \mid \phi(x)$; then $C = [n, k]$, and a generated matrix for $C$ is the following block matrix:

\[
G = \begin{pmatrix} g \\ \tau_\phi(g) \\ \tau_\phi^2(g) \\ \vdots \\ \tau_\phi^{k-1}(g) \end{pmatrix}_{k \times n}, \tag{12}
\]

where $g = (g_0, g_1, \ldots, g_{n-k-1}, 1, 0, \ldots, 0) \in C$ is the corresponding codeword of $g(x)$, and $\tau_\phi^i(g) = \tau_\phi(\tau_\phi^{i-1}(g))$ for $1 \le i \le n - 1$.


Proof By assumption, $C(x) = \langle g(x) \rangle \bmod \phi(x)$; then $\{ g, \tau_\phi(g), \ldots, \tau_\phi^{k-1}(g) \} \subset C$, and we are to prove that it is a basis of $C$. First, these vectors are linearly independent. Otherwise, we have

\[
\sum_{i=0}^{k-1} b_i \tau_\phi^i(g) = 0, \quad \text{for some } b_i \in \mathbb{F}_q, \tag{13}
\]

and the corresponding polynomial is zero, namely

\[
\Big( \sum_{i=0}^{k-1} b_i x^i \Big) g(x) = 0.
\]

It follows that

\[
\sum_{i=0}^{k-1} b_i x^i = 0 \ \Rightarrow\ b_i = 0 \text{ for all } i,\ 0 \le i \le k - 1.
\]

Next, if $c \in C$ and $c(x) \in C(x)$, by (10), there is a polynomial $b(x) = b_0 + b_1 x + \cdots + b_{k-2}x^{k-2} + x^{k-1}$ such that

\[
c(x) = b(x) g(x) = \Big( \sum_{i=0}^{k-1} b_i x^i \Big) g(x), \quad \text{where } b_{k-1} = 1.
\]

Thus, we have the corresponding codeword of $c(x)$:

\[
c = \sum_{i=0}^{k-1} b_i \tau_\phi^i(g).
\]

This shows that $\{ g, \tau_\phi(g), \ldots, \tau_\phi^{k-1}(g) \}$ is a basis of $C$, and a generated matrix for $C$ is

\[
G = \begin{pmatrix} g \\ \tau_\phi(g) \\ \tau_\phi^2(g) \\ \vdots \\ \tau_\phi^{k-1}(g) \end{pmatrix}_{k \times n}.
\]

We have Lemma 1 at once.
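The generated matrix of Lemma 1 can be built mechanically from $g$ and $\tau_\phi$. The sketch below assumes a prime $q$ and uses a hypothetical example over $\mathbb{F}_3$ with $\phi(x) = (x + 2)(x^2 + 1) = x^3 - x^2 - 2x - 1$, i.e. $a = (1, 2, 1)$, and $g(x) = x + 2$; all names are illustrative.

```python
from itertools import product

def tau_phi(c, a, q):
    """tau_phi from (4) over F_q, q prime."""
    last = c[-1]
    return [(a[0] * last) % q] + [(c[i - 1] + a[i] * last) % q for i in range(1, len(c))]

def generator_matrix(g, a, q):
    """Rows g, tau(g), ..., tau^{k-1}(g); g lists g_0, ..., g_{n-k-1}, 1."""
    n = len(a)
    k = n - (len(g) - 1)  # k = n - deg g
    row = list(g) + [0] * (n - len(g))
    rows = []
    for _ in range(k):
        rows.append(row)
        row = tau_phi(row, a, q)
    return rows

def span(rows, q):
    """All F_q-linear combinations of the rows (brute force, small cases only)."""
    n = len(rows[0])
    return {tuple(sum(b * r[j] for b, r in zip(coeffs, rows)) % q for j in range(n))
            for coeffs in product(range(q), repeat=len(rows))}

a, q = [1, 2, 1], 3
G = generator_matrix([2, 1], a, q)  # g(x) = x + 2
code = span(G, q)
```

In this example the code has $q^k = 9$ codewords and is closed under $\tau_\phi$, as Lemma 1 and Theorem 1 predict for a monic divisor of $\phi(x)$.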


To describe a parity check matrix for a $\phi$-cyclic code, for any $c = (c_0, c_1, \ldots, c_{n-1}) \in \mathbb{F}_q^n$, we write

\[
\bar{c} = (c_{n-1}, c_{n-2}, \ldots, c_1, c_0) \in \mathbb{F}_q^n.
\]


Lemma 2 Suppose $C$ is a $\phi$-cyclic code with generated polynomial $g(x)$, where $g(x) \mid \phi(x)$ and $\deg g(x) = n - k$. Let $h(x) g(x) = \phi(x)$, where $h(x) = h_0 + h_1 x + \cdots + h_{k-1}x^{k-1} + x^k$. Then a parity check matrix for $C$ is


i 
i 
m (14) 


m T 
tih) 


(n—k)xn 


Proof Since $h(x) g(x) = \phi(x)$, it means that $h(x) g(x) = 0$ in $R = \mathbb{F}_q[x]/\langle \phi(x) \rangle$; thus we have

\[
g_0 h_i + g_1 h_{i-1} + \cdots + g_{n-k} h_{i-n+k} = 0, \quad \forall\, 0 \le i \le n - 1.
\]

It follows that $G H' = 0$, where $G$ is a generated matrix for $C$ given by (12). Therefore, $H$ is a parity check matrix for $C$.


A separable polynomial in algebra is a polynomial that has no multiple roots in its splitting field. The following lemma shows that there is a unit element in any non-zero ideal of $R$ when $\phi(x)$ is a separable polynomial.


Lemma 3 Suppose $\phi(x)$ is a separable polynomial over $\mathbb{F}_q$, and $C(x) = \langle g(x) \rangle \bmod \phi(x)$ is an ideal of $R$ with $\deg g(x) \le n - 1$; then there exists an element $d(x) \in C(x)$ such that

\[
c(x) d(x) = c(x), \quad \text{for all } c(x) \in C(x).
\]


Proof Let $h(x) g(x) = \phi(x)$. Since $\phi(x)$ is a separable polynomial, $\gcd(g(x), h(x)) = 1$, and there are two polynomials $a(x)$ and $b(x)$ in $\mathbb{F}_q[x]$ such that

\[
a(x) g(x) + b(x) h(x) = 1.
\]

Let

\[
d(x) = a(x) g(x) = 1 - b(x) h(x) \in C(x).
\]

If $c(x) \in C(x)$, by (10) we write $c(x) = g(x) g_1(x)$; it follows that

\[
c(x) d(x) = a(x) g(x) g(x) g_1(x) = \big(1 - b(x) h(x)\big) g(x) g_1(x) \equiv g(x) g_1(x) = c(x) \pmod{\phi(x)}.
\]

Thus, we have $c(x) d(x) = c(x)$ in $R$.


Next, we discuss maximal $\phi$-cyclic codes. Let $C(x) = \langle g(x) \rangle \bmod \phi(x)$, where $g(x)$ is an irreducible polynomial in $\mathbb{F}_q[x]$; we call the corresponding $\phi$-cyclic code $C$ a maximal $\phi$-cyclic code, because $\langle g(x) \rangle$ is a maximal ideal in $\mathbb{F}_q[x]$.


Lemma 4 Let $C$ be a maximal $\phi$-cyclic code with generated polynomial $g(x)$, and let $\beta$ be a root of $g(x)$ in some extension of $\mathbb{F}_q$; then

\[
C(x) = \{ a(x) \mid a(x) \in R \text{ and } a(\beta) = 0 \}. \tag{15}
\]


Proof If $a(x) \in C(x)$, by (10) we have $a(\beta) = 0$ immediately. Conversely, if $a(x) \in \mathbb{F}_q[x]$ and $a(\beta) = 0$, then since $g(x)$ is irreducible, we have $g(x) \mid a(x)$, and (15) follows at once.


An important application of maximal $\phi$-cyclic codes is to construct an error-correcting code, so that we may obtain a modified McEliece-Niederreiter cryptosystem. To do this, let $1 < m < n$, and let $\mathbb{F}_{q^m}$ be an extension field of $\mathbb{F}_q$ of degree $m$. Suppose $\mathbb{F}_{q^m} = \mathbb{F}_q(\theta)$, where $\theta$ is a primitive element of $\mathbb{F}_{q^m}$ and $\mathbb{F}_q(\theta)$ is the simple extension containing $\mathbb{F}_q$ and $\theta$. Let $g(x) \in \mathbb{F}_q[x]$ be the minimal polynomial of $\theta$; then $g(x)$ is an irreducible polynomial of degree $m$ in $\mathbb{F}_q[x]$. It is well known that $\mathbb{F}_{q^m}$ is a Galois extension of $\mathbb{F}_q$, so that all roots of $g(x)$ are in $\mathbb{F}_{q^m}$. Let $\beta_1, \beta_2, \ldots, \beta_m$ be all roots of $g(x)$; the Vandermonde matrix $V(\beta_1, \beta_2, \ldots, \beta_m)$ is defined by

\[
H = V(\beta_1, \beta_2, \ldots, \beta_m) = \begin{pmatrix} 1 & \beta_1 & \beta_1^2 & \cdots & \beta_1^{n-1} \\ 1 & \beta_2 & \beta_2^2 & \cdots & \beta_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \beta_m & \beta_m^2 & \cdots & \beta_m^{n-1} \end{pmatrix}_{m \times n}, \tag{16}
\]

where $\beta_1 = \theta$ and each $\beta_j$ is regarded as a vector in $(\mathbb{F}_q)^m$. For an arbitrary monic polynomial $h(x) \in \mathbb{F}_q[x]$ with $\deg h(x) = n - m$, let $\phi(x) = h(x) g(x)$ and let $C$ be the maximal $\phi$-cyclic code generated by $g(x)$. It is easy to verify that

\[
c \in C \ \Leftrightarrow\ c H' = 0.
\]

Therefore, $H$ is a parity check matrix for $C$. If we choose the primitive element $\theta$ so that any $d - 1$ columns in $H$ are linearly independent, then the minimum distance of $C$ is at least $d$, and $C$ is a $t$-error-correcting code, where $t = \lfloor (d-1)/2 \rfloor$.
The public key cryptosystems based on algebraic coding theory were created by Lyubashevsky and Micciancio (2006) and Micciancio and Regev (2009); a suitable $t$-error-correcting code plays a key role in their construction. The error-correcting code $C$ should satisfy the following requirements:


(i) $C$ should have a relatively large error-correcting capability so that a reasonable number of message vectors can be used;

(ii) $C$ should allow an efficient decoding algorithm so that the decryption can be carried out within a short time.


Our results supply a different way to choose an error-correcting code, by selecting arbitrary irreducible polynomials $g(x) \in \mathbb{F}_q[x]$ of degree $m$ and the roots of $g(x)$, rather than an irreducible factor of $x^n - 1$ and roots of unity as in the ordinary BCH code and Goppa code.

In fact, for any positive integer $m$, there is at least one irreducible polynomial $g(x) \in \mathbb{F}_q[x]$ of degree $m$. Let $N_q(m)$ be the number of monic irreducible polynomials of degree $m$ in $\mathbb{F}_q[x]$; then we have (see Theorem 3.25 of Lidl and Niederreiter (1983))

\[
N_q(m) = \frac{1}{m} \sum_{d \mid m} \mu\Big(\frac{m}{d}\Big) q^{d} = \frac{1}{m} \sum_{d \mid m} \mu(d)\, q^{m/d},
\]

where $\mu(d)$ is the Möbius function.
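The counting formula is straightforward to evaluate. The following sketch computes $N_q(m)$ from the second form of the sum; the function names `mobius` and `num_irreducible` are illustrative.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result

def num_irreducible(q, m):
    """Number of monic irreducible polynomials of degree m over F_q."""
    total = sum(mobius(d) * q ** (m // d) for d in range(1, m + 1) if m % d == 0)
    return total // m

# Over F_2 there are 2, 1, 2, 3 monic irreducible polynomials of degree 1, 2, 3, 4.
counts_f2 = [num_irreducible(2, m) for m in range(1, 5)]
```

Since $N_q(m) \ge 1$ for every $m$, a suitable $g(x)$ of any prescribed degree always exists, which is the point used in the construction above.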

Assuming one has selected two monic and irreducible polynomials $g(x)$ and $h(x)$ with $\deg g(x) = m$ and $\deg h(x) = n - m$, let $\phi(x) = g(x) h(x)$; then one may obtain the $\phi$-cyclic code $C$ generated by $g(x)$ or $h(x)$, which is more convenient and more flexible than the ordinary methods.


2 A Generalization of NTRUEncrypt 


The public key cryptosystem NTRU proposed in 1996 by Hoffstein, Pipher, and Silverman is the fastest known lattice-based encryption scheme. Although its description relies on arithmetic over the polynomial quotient ring $\mathbb{Z}[x]/\langle x^n - 1 \rangle$, it was easily observed that it could be expressed as a lattice-based cryptosystem (see IEEE Computer Society (2000)). For the background materials, we refer to (Coppersmith & Shamir, 1997; Hoffstein et al., 1998, 2017; Lint, 1999; McEliece, 1978; Micciancio, 2001). Our strategy in this section is to replace $\mathbb{Z}[x]/\langle x^n - 1 \rangle$ by a more general polynomial ring $\mathbb{Z}[x]/\langle \phi(x) \rangle$ and obtain a generalization of NTRUEncrypt, where $\phi(x)$ is a monic polynomial of degree $n$ with integer coefficients.
In this section, we denote $\phi(x)$ and $R$ by

\[
\phi(x) = x^n - a_{n-1}x^{n-1} - \cdots - a_1 x - a_0 \in \mathbb{Z}[x], \quad R = \mathbb{Z}[x]/\langle \phi(x) \rangle, \quad a_0 \neq 0. \tag{17}
\]

Let $H = H_\phi \in \mathbb{Z}^{n \times n}$ be the square matrix given by

\[
H = H_\phi = \begin{pmatrix} 0 & \cdots & 0 & a_0 \\ & & & a_1 \\ & I_{n-1} & & \vdots \\ & & & a_{n-1} \end{pmatrix}_{n \times n}, \tag{18}
\]

where $I_{n-1}$ is the $(n-1) \times (n-1)$ unit matrix. Obviously, $\phi(x)$ is the characteristic polynomial of $H$, and $H$ defines a linear transformation $\mathbb{R}^n \to \mathbb{R}^n$ by $x \mapsto Hx$, where $\mathbb{R}$ is the real number field and $x$ is a column vector of $\mathbb{R}^n$. We may extend this transformation to $\mathbb{R}^{2n}$ and denote it by $\sigma$:

\[
\sigma \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} H\alpha \\ H\beta \end{pmatrix}, \quad \text{where } \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \in \mathbb{R}^{2n}. \tag{19}
\]

Of course, $\sigma$ is again a linear transformation $\mathbb{R}^{2n} \to \mathbb{R}^{2n}$.
According to Micciancio (2001), a $q$-ary lattice is a lattice $L$ such that $q\mathbb{Z}^n \subset L \subset \mathbb{Z}^n$, where $q$ is a positive integer.


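Definition 3 A $q$-ary lattice $L$ is called a convolutional modular lattice if $L$ is in even dimension $2n$ and satisfies

\[
\begin{pmatrix} \alpha \\ \beta \end{pmatrix} \in L \ \Rightarrow\ \sigma \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} H\alpha \\ H\beta \end{pmatrix} \in L. \tag{20}
\]

In other words, a convolutional modular lattice is a $q$-ary lattice in even dimension that is closed under the linear transformation $\sigma$.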

Recalling that the secret key $\binom{f}{g}$ of NTRU is a pair of polynomials of degree $n - 1$, we may regard $f$ and $g$ as column vectors in $\mathbb{Z}^n$. To obtain a convolutional modular lattice containing $\binom{f}{g}$, we need some help from ideal matrices. An ideal matrix generated by a vector $f$ is defined by

\[
H^*(f) = H_\phi^*(f) = [f, Hf, H^2 f, \ldots, H^{n-1} f]_{n \times n}, \tag{21}
\]

which is a block matrix in terms of the columns $H^k f$ $(0 \le k \le n - 1)$. It is easily seen that $H^*(f)$ is a generalization of the classical circulant matrices (see Davis (1994)). In fact, let $\phi(x) = x^n - 1$ and $f(x) = f_0 + f_1 x + \cdots + f_{n-1}x^{n-1} \in \mathbb{Z}[x]$; then the ideal matrix $H^*(f)$ generated by $f$ is given by

\[
H^*(f) = H_\phi^*(f) = \begin{pmatrix} f_0 & f_{n-1} & \cdots & f_1 \\ f_1 & f_0 & \cdots & f_2 \\ \vdots & \vdots & & \vdots \\ f_{n-1} & f_{n-2} & \cdots & f_0 \end{pmatrix}, \quad \phi(x) = x^n - 1,
\]

which is known as a circulant matrix. On the other hand, ideal matrices and ideal lattices play an important role in Ajtai's construction of a collision-resistant hash function; for the related materials we refer to (Ajtai, 1996; Ajtai & Dwork, 1997; Lint, 1999; Niederreiter, 1986; Plantard & Schneider, 2013; Pradhan et al., 2019).
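The definition of the ideal matrix can be turned into a short computation. The sketch below builds the companion matrix $H$ from the coefficient vector $a$ and stacks the columns $f, Hf, \ldots, H^{n-1}f$; the function names and the sample vectors are illustrative, and plain integer lists are used so that all arithmetic is exact.

```python
def companion(a):
    """Matrix H for phi(x) = x^n - a[n-1] x^(n-1) - ... - a[0]:
    ones on the subdiagonal and (a[0], ..., a[n-1]) as last column."""
    n = len(a)
    H = [[0] * n for _ in range(n)]
    for i in range(1, n):
        H[i][i - 1] = 1
    for i in range(n):
        H[i][n - 1] = a[i]
    return H

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def ideal_matrix(f, a):
    """Ideal matrix with columns f, Hf, ..., H^(n-1) f."""
    H = companion(a)
    cols, v = [], list(f)
    for _ in range(len(a)):
        cols.append(v)
        v = matvec(H, v)
    return [list(row) for row in zip(*cols)]

# phi(x) = x^4 - 1 gives the classical circulant matrix of f.
circulant = ideal_matrix([1, 2, 3, 4], [1, 0, 0, 0])

# A general phi, here x^4 - 3x^3 - x^2 - 2 (hypothetical coefficients).
a = [2, 0, 1, 3]
M = ideal_matrix([1, -2, 0, 5], a)
```

For $\phi(x) = x^n - 1$ each column is a cyclic shift of the previous one, so the result is circulant; for a general $\phi$ the wrap-around is replaced by reduction modulo $\phi(x)$.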

First, we have to establish some basic properties of an ideal matrix $H^*(f)$, most of which are known when $H^*(f)$ is a circulant matrix.


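Lemma 5 Suppose $H$ and $H^*(f)$ are given by (18) and (21), respectively; then for any $f \in \mathbb{R}^n$, we have

\[
H \cdot H^*(f) = H^*(f) \cdot H, \quad \forall f \in \mathbb{R}^n.
\]


Proof Since $\phi(x) = x^n - a_{n-1}x^{n-1} - \cdots - a_1 x - a_0$ is the characteristic polynomial of $H$, by the Hamilton-Cayley theorem, we have

\[
H^n = a_0 I_n + a_1 H + \cdots + a_{n-1} H^{n-1}. \tag{22}
\]

Let

\[
b = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{pmatrix}, \quad \text{so that } H = \begin{pmatrix} 0 & a_0 \\ I_{n-1} & b \end{pmatrix}.
\]

By (21) we have

\[
\begin{aligned}
H^*(f) H &= [f, Hf, \ldots, H^{n-1} f] \begin{pmatrix} 0 & a_0 \\ I_{n-1} & b \end{pmatrix} \\
&= [Hf, H^2 f, \ldots, H^{n-1} f,\ a_0 f + a_1 Hf + \cdots + a_{n-1} H^{n-1} f] \\
&= [Hf, H^2 f, \ldots, H^{n-1} f, H^n f] \\
&= H [f, Hf, \ldots, H^{n-1} f] = H \cdot H^*(f).
\end{aligned}
\]

The lemma follows.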


Lemma 6 For any f = . € R" we have 
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H*(f) = fola + fhH t foi” |. (23) 


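Proof We use induction on $n$ to show this conclusion. If $n = 1$, it is trivial. Suppose it is true for $n$; we consider the case of $n + 1$. For this purpose, we write $H = H_n$ and let $e_1, e_2, \ldots, e_n$ be the $n$ unit column vectors in $\mathbb{R}^n$, namely
$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix},$$
and
$$H_{n+1} = \begin{pmatrix} 0 & A_0 \\ e_1 & H_n \end{pmatrix},$$
where $A_0 = (0, 0, \ldots, a_0) \in \mathbb{R}^n$ is a row vector. For any $k$, $1 \leq k \leq n - 1$, it is easy to check that
$$H_n e_k = e_{k+1}, \quad H_n^k e_1 = e_{k+1}, \quad \text{and} \quad H_{n+1}^k = \begin{pmatrix} 0 & A_0 H_n^{k-1} \\ e_k & H_n^k \end{pmatrix}.$$
Let $f = (f_0, f_1, \ldots, f_n)' \in \mathbb{R}^{n+1}$; we denote by $f'$ the vector
$$f' = \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix} \in \mathbb{R}^n, \quad \text{so that} \quad f = \begin{pmatrix} f_0 \\ f' \end{pmatrix}.$$
By the induction assumption, we have
$$H_n^*(f') = [f', H_n f', \ldots, H_n^{n-1} f'] = f_1 I_n + f_2 H_n + \cdots + f_n H_n^{n-1}.$$
It follows that
$$H_{n+1}^*(f) = \left[ \binom{f_0}{f'}, H_{n+1}\binom{f_0}{f'}, \ldots, H_{n+1}^{n}\binom{f_0}{f'} \right] = f_0 I_{n+1} + f_1 H_{n+1} + \cdots + f_n H_{n+1}^{n}.$$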
We complete the proof of Lemma 6.

We always suppose that $\phi(x) \in \mathbb{Z}[x]$ is a separable polynomial and that $w_1, w_2, \ldots, w_n$ are the complex roots of $\phi(x)$, which are pairwise distinct. The Vandermonde matrix $V_\phi$ generated by $\{w_1, w_2, \ldots, w_n\}$ is
$$V_\phi = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ w_1 & w_2 & \cdots & w_n \\ \vdots & \vdots & & \vdots \\ w_1^{n-1} & w_2^{n-1} & \cdots & w_n^{n-1} \end{pmatrix}, \quad \text{and } \det(V_\phi) \neq 0.$$


Lemma 7 Let $f(x) = f_0 + f_1 x + \cdots + f_{n-1}x^{n-1} \in \mathbb{R}[x]$. Then we have
$$H^*(f) = V_\phi^{-1} \operatorname{diag}\{f(w_1), f(w_2), \ldots, f(w_n)\} V_\phi, \tag{24}$$
where $\operatorname{diag}\{f(w_1), f(w_2), \ldots, f(w_n)\}$ is the diagonal matrix.


Proof By Theorem 3.2.5 of Davis (1994), for $H$ we have
$$H = V_\phi^{-1} \operatorname{diag}\{w_1, w_2, \ldots, w_n\} V_\phi. \tag{25}$$
By Lemma 6, it follows that
$$H^*(f) = V_\phi^{-1} \operatorname{diag}\{f(w_1), f(w_2), \ldots, f(w_n)\} V_\phi.$$


Now, we summarize some basic properties for ideal matrix as follows. 


Theorem 2 Let $f \in \mathbb{R}^n$, $g \in \mathbb{R}^n$ be two column vectors and $H^*(f)$ be the ideal matrix generated by $f$. Then we have the following:

(i) $H^*(f)H^*(g) = H^*(g)H^*(f)$.

(ii) $H^*(f)H^*(g) = H^*(H^*(f)g)$.

(iii) $\det(H^*(f)) = \prod_{i=1}^{n} f(w_i)$.

(iv) $H^*(f)$ is an invertible matrix if and only if $\phi(x)$ and $f(x)$ are coprime, i.e. $\gcd(\phi(x), f(x)) = 1$.


Proof (i) and (ii) follow from Lemma 6 immediately; (iii) and (iv) follow from Lemma 7. Here we only give an equivalent form of (ii). Let
$$f * g = H^*(f)g. \tag{26}$$

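Then by (ii) we have
$$H^*(f * g) = H^*(f)H^*(g). \tag{27}$$

To construct a convolutional modular lattice containing the vector $\binom{f}{g}$, let $\binom{f}{g} \in \mathbb{Z}^{2n}$, let $(H^*(f))'$ be the transpose of $H^*(f)$, and
$$A = [(H^*(f))', (H^*(g))'] = \begin{pmatrix} f' & g' \\ f'H' & g'H' \\ \vdots & \vdots \\ f'(H')^{n-1} & g'(H')^{n-1} \end{pmatrix}_{n \times 2n}, \tag{28}$$
$$A' = \begin{pmatrix} H^*(f) \\ H^*(g) \end{pmatrix} = \begin{pmatrix} f & Hf & \cdots & H^{n-1}f \\ g & Hg & \cdots & H^{n-1}g \end{pmatrix}_{2n \times n}. \tag{29}$$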


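We consider $A$ and $A'$ as matrices over $\mathbb{Z}_q$, i.e. $A \in \mathbb{Z}_q^{n \times 2n}$ and $A' \in \mathbb{Z}_q^{2n \times n}$. The $q$-ary lattice $\Lambda_q(A)$ is defined by (see Micciancio (2001))
$$\Lambda_q(A) = \{y \in \mathbb{Z}^{2n} \mid \text{there exists } x \in \mathbb{Z}^n \text{ such that } y \equiv A'x \ (\mathrm{mod}\ q)\}. \tag{30}$$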
Under the above notations, we have the following. 


Theorem 3 For any column vectors $f \in \mathbb{Z}^n$ and $g \in \mathbb{Z}^n$, $\Lambda_q(A)$ is a convolutional modular lattice, and $\binom{f}{g} \in \Lambda_q(A)$.


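Proof It is known that $\Lambda_q(A)$ is a $q$-ary lattice, i.e.
$$q\mathbb{Z}^{2n} \subset \Lambda_q(A) \subset \mathbb{Z}^{2n}.$$

We only prove that $\Lambda_q(A)$ is fixed under the linear transformation $\sigma$ given by (2.4). If $y \in \Lambda_q(A)$, then $y \equiv A'x \ (\mathrm{mod}\ q)$ for some $x \in \mathbb{Z}^n$, and by Lemma 6 we have
$$\sigma(y) \equiv \begin{pmatrix} H H^*(f)x \\ H H^*(g)x \end{pmatrix} = \begin{pmatrix} H^*(f)Hx \\ H^*(g)Hx \end{pmatrix} = A'Hx \ (\mathrm{mod}\ q).$$
It means that $\sigma(y) \in \Lambda_q(A)$ whenever $y \in \Lambda_q(A)$. Let
$$e = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \in \mathbb{Z}^n \ \Rightarrow \ H^*(f)e = f, \text{ and } H^*(g)e = g.$$
We have $\binom{f}{g} = A'e \in \Lambda_q(A)$, and Theorem 3 follows.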


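Since $\Lambda_q(A) \subset \mathbb{Z}^{2n}$, there is a unique Hermite Normal Form basis $N$, which is an upper triangular matrix given by
$$N = \begin{pmatrix} I_n & H^*(h) \\ 0 & qI_n \end{pmatrix}, \quad \text{where } h \equiv (H^*(f))^{-1} g \ (\mathrm{mod}\ q). \tag{31}$$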


Next, we consider the parameter system of NTRU. To choose the parameters of NTRU, let $d_f$ be a positive integer and let $\{p, 0, -p\}^n \subset \mathbb{Z}^n$ denote the subset of vectors having exactly $d_f + 1$ positive entries and $d_f$ negative ones, the remaining $n - 2d_f - 1$ entries being zero. We make the following assumptions on the choice of parameters:

(i) $\phi(x) = x^n - a_{n-1}x^{n-1} - \cdots - a_1 x - a_0 \in \mathbb{Z}[x]$ with $a_0 \neq 0$, and $\phi(x)$ is a separable polynomial; $n, p, q, d_f$ are positive integers with $n$ prime, $1 < p < q$ and $\gcd(p, q) = 1$.

(ii) $f(x)$ and $g(x)$ are two polynomials in $\mathbb{Z}[x]$ of degree $n - 1$, the constant term of $f(x)$ is 1, and
$$f(x) - 1 \in \{p, 0, -p\}^n, \quad g \in \{p, 0, -p\}^n.$$

(iii) $H^*(f)$ is invertible modulo $q$.

(iv) $d_f < (q/2 - 1)/(4p) - \frac{1}{2}$.


Under the above conditions, by Lemma 6, we have
$$H^*(f) \equiv I_n \ (\mathrm{mod}\ p), \quad \text{and} \quad H^*(g) \equiv 0 \ (\mathrm{mod}\ p). \tag{32}$$


Now, we state a generalization of NTRU as follows. 


Private key. The private key in the generalized NTRU is a short vector $\binom{f}{g} \in \mathbb{Z}^{2n}$. The lattice associated with a private key is $\Lambda_q(A)$, which is a convolutional modular lattice containing the private key.

Public key. The public key of the generalized NTRU is the HNF basis $N$ of $\Lambda_q(A)$, which is given by (2.15).

Encryption. An input message is encoded as a vector $m \in \{1, 0, -1\}^n$ with exactly $d_f + 1$ positive entries and $d_f$ negative ones. The vector $m$ is concatenated with a randomly chosen vector $r \in \{1, 0, -1\}^n$, also with exactly $d_f + 1$ positive entries and $d_f$ negative ones, to obtain a short error vector $\binom{m}{r} \in \{1, 0, -1\}^{2n}$. Let
$$\binom{c}{0} \equiv N \binom{m}{r} = \binom{m + H^*(h)r}{qr} \ (\mathrm{mod}\ q), \tag{33}$$
where $h$ is given by (2.15). Then the $n$-dimensional vector $c$,
$$c \equiv m + H^*(h)r \ (\mathrm{mod}\ q),$$
is the ciphertext.

Decryption. Suppose the entries of the $n$-dimensional vector $c$ belong to the interval $[-\frac{q}{2}, \frac{q}{2}]$. The ciphertext $c$ is decrypted by multiplying it by the secret matrix $H^*(f)$ mod $q$; it follows that
$$H^*(f)c \equiv H^*(f)m + H^*(f)H^*(h)r \equiv H^*(f)m + H^*(g)r \ (\mathrm{mod}\ q). \tag{34}$$
Here we use identity (ii) of Theorem 2, namely
$$H^*(f)H^*(g) = H^*(H^*(f)g).$$

If condition (iv) above is satisfied, it is easily seen that the coordinates of the vector $H^*(f)m + H^*(g)r$ are all bounded by $\frac{q}{2}$ in absolute value, or, with high probability, even for larger values of $d_f$. The decryption process is completed by reducing (2.18) modulo $p$, to obtain
$$H^*(f)m + H^*(g)r \equiv I_n m = m \ (\mathrm{mod}\ p).$$

Thus, one gets the plaintext $m$ from the ciphertext $c$.
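The scheme above can be exercised end to end on toy parameters. The following Python sketch is illustrative only and not secure: all function names are ours, $q$ is taken prime so that a plain Gauss-Jordan solver modulo $q$ suffices (an assumption made for brevity, not a requirement of the text), and the demo uses $\phi(x) = x^7 - 1$, for which every coordinate of $H^*(f)m + H^*(g)r$ is at most 18 in absolute value, well below $q/2$ for $q = 67$.

```python
import random


def rotation_matrix(a):
    """Assumed companion form of H for phi(x) = x^n - a[n-1] x^(n-1) - ... - a[0]."""
    n = len(a)
    H = [[0] * n for _ in range(n)]
    for i in range(1, n):
        H[i][i - 1] = 1
    for i in range(n):
        H[i][n - 1] = a[i]
    return H


def mat_vec(M, v, mod=None):
    out = [sum(x * y for x, y in zip(row, v)) for row in M]
    return [x % mod for x in out] if mod else out


def ideal_matrix(a, f):
    H, cols = rotation_matrix(a), [list(f)]
    for _ in range(len(f) - 1):
        cols.append(mat_vec(H, cols[-1]))
    return [list(row) for row in zip(*cols)]


def solve_mod(M, b, q):
    """Solve M x = b (mod prime q) by Gauss-Jordan; None if M is singular mod q."""
    n = len(M)
    A = [[x % q for x in row] + [b[i] % q] for i, row in enumerate(M)]
    for c in range(n):
        piv = next((r for r in range(c, n) if A[r][c]), None)
        if piv is None:
            return None
        A[c], A[piv] = A[piv], A[c]
        inv = pow(A[c][c], -1, q)
        A[c] = [x * inv % q for x in A[c]]
        for r in range(n):
            if r != c and A[r][c]:
                k = A[r][c]
                A[r] = [(x - k * y) % q for x, y in zip(A[r], A[c])]
    return [A[i][n] for i in range(n)]


def ternary(n, d, rng, first_zero=False):
    """Vector in {1,0,-1}^n with d+1 entries equal to 1 and d entries equal to -1."""
    idx = list(range(1 if first_zero else 0, n))
    rng.shuffle(idx)
    v = [0] * n
    for i in idx[:d + 1]:
        v[i] = 1
    for i in idx[d + 1:2 * d + 1]:
        v[i] = -1
    return v


def center(x, m):
    """Representative of x mod m closest to zero."""
    x %= m
    return x - m if x > m // 2 else x


def keygen(a, p, q, d, rng):
    n = len(a)
    while True:  # resample until H^*(f) is invertible mod q (condition (iii))
        F = ternary(n, d, rng, first_zero=True)
        f = [1] + [p * x for x in F[1:]]             # f = 1 + p*F, constant term 1
        g = [p * x for x in ternary(n, d, rng)]
        h = solve_mod(ideal_matrix(a, f), g, q)      # h = H^*(f)^(-1) g mod q
        if h is not None:
            return f, h


def encrypt(a, h, m, r, q):
    return [(mi + x) % q for mi, x in zip(m, mat_vec(ideal_matrix(a, h), r, q))]


def decrypt(a, f, c, p, q):
    u = [center(x, q) for x in mat_vec(ideal_matrix(a, f), c, q)]
    return [center(x, p) for x in u]


rng = random.Random(0)
n, p, q, d = 7, 3, 67, 1            # d < (q/2 - 1)/(4p) - 1/2 holds for these values
a = [1] + [0] * (n - 1)             # phi(x) = x^7 - 1
f, h = keygen(a, p, q, d, rng)
m, r = ternary(n, d, rng), ternary(n, d, rng)
print(decrypt(a, f, encrypt(a, h, m, r, q), p, q) == m)
```

For a general $\phi$, the entries of $H^k f$ can grow with $k$, so whether the same small $q$ still decrypts correctly would have to be checked case by case; the sketch does not attempt this.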

Cyclic Lattices, Ideal Lattices, and Bounds for the Smoothing Parameter


Zheng Zhiyong, Liu Fengxia, Lu Yunfan, and Tian Kun 


Abstract Cyclic lattices and ideal lattices were introduced by Micciancio (2002) and by Lyubashevsky and Micciancio (2006), respectively. They play an important role in Ajtai's construction of a collision-resistant hash function (see Ajtai (1996), Ajtai and Dwork (1997)) and in Gentry's construction of fully homomorphic encryption (see Gentry (2009)). Let $R = \mathbb{Z}[x]/\langle\phi(x)\rangle$ be a quotient ring of the ring of polynomials with integer coefficients. Lyubashevsky and Micciancio regarded an ideal lattice as the correspondence of an ideal of $R$, but they neither explained how to extend this definition to the whole Euclidean space $\mathbb{R}^n$, nor exhibited the relationship between cyclic lattices and ideal lattices. In this chapter, we regard cyclic lattices and ideal lattices as the correspondences of finitely generated $R$-modules, so that we may show that ideal lattices are actually a special subclass of cyclic lattices, namely, cyclic integer lattices. In fact, there is a one-to-one correspondence between cyclic lattices in $\mathbb{R}^n$ and finitely generated $R$-modules (see Theorem 4). On the other hand, since $R$ is a Noetherian ring, each ideal of $R$ is a finitely generated $R$-module, so it is natural and reasonable to regard ideal lattices as a special subclass of cyclic lattices (see Corollary 7). It is worth noting that we use a more general rotation matrix here, so our definitions and results on cyclic lattices and ideal lattices take more general forms. As an application, we provide a cyclic lattice with an explicit and computable upper bound for the smoothing parameter (see Theorem 5). Whether the shortest vector problem on cyclic lattices is NP-hard is an open problem (see Micciancio (2002)). Our results may be viewed as substantial progress in this direction.

1 Discrete Subgroup in $\mathbb{R}^n$


Let $\mathbb{R}$ be the field of real numbers, $\mathbb{Z}$ be the ring of integers, and $\mathbb{R}^n$ be the Euclidean space, which is an $n$-dimensional linear space over $\mathbb{R}$ with the Euclidean norm $|x|$ given by
$$|x| = \left( \sum_{i=1}^{n} x_i^2 \right)^{\frac{1}{2}}, \quad \text{where } x' = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n.$$

We use column vector notation for $\mathbb{R}^n$ throughout this chapter, and $x' = (x_1, x_2, \ldots, x_n)$ is the transpose of $x$, which is called a row vector of $\mathbb{R}^n$.


Definition 1 Let $L \subset \mathbb{R}^n$ be a non-trivial additive subgroup. It is called a discrete subgroup if there is a positive real number $\lambda > 0$ such that
$$\min_{x \in L,\ x \neq 0} |x| \geq \lambda > 0. \tag{1}$$

As usual, a ball of center $x_0$ with radius $\delta$ is defined by
$$b(x_0, \delta) = \{x \in \mathbb{R}^n \mid |x - x_0| \leq \delta\}.$$

If $L$ is a discrete subgroup of $\mathbb{R}^n$, then only finitely many vectors of $L$ lie in every ball $b(0, \delta)$, thus we can always find a vector $\alpha \in L$ such that
$$|\alpha| = \min_{x \in L,\ x \neq 0} |x| \geq \lambda > 0, \quad \alpha \in L. \tag{2}$$

$\alpha$ is called a shortest vector of $L$ and $\lambda$ is called the minimum distance of $L$.


Let $B = [\beta_1, \beta_2, \ldots, \beta_m] \in \mathbb{R}^{n \times m}$ be an $n \times m$ matrix with $\operatorname{rank}(B) = m \leq n$; this means that $\beta_1, \beta_2, \ldots, \beta_m$ are $m$ linearly independent vectors in $\mathbb{R}^n$. The lattice $L(B)$ generated by $B$ is defined by
$$L(B) = \left\{ \sum_{i=1}^{m} x_i \beta_i \ \Big|\ x_i \in \mathbb{Z} \right\} = \{Bx \mid x \in \mathbb{Z}^m\}, \tag{3}$$
which is the set of all linear combinations of $\beta_1, \beta_2, \ldots, \beta_m$ over $\mathbb{Z}$. If $m = n$, $L(B)$ is called a full-rank lattice.

It is well known that a discrete subgroup $L$ in $\mathbb{R}^n$ is just a lattice $L(B)$. First, we give a detailed proof here by making use of simultaneous Diophantine approximation theory in the real number field $\mathbb{R}$ (see Cassels (1971) and Cassels (1963)).


Lemma 1 Let $L \subset \mathbb{R}^n$ be a discrete subgroup and $\alpha_1, \alpha_2, \ldots, \alpha_m \in L$ be $m$ vectors of $L$. Then $\alpha_1, \alpha_2, \ldots, \alpha_m$ are linearly independent over $\mathbb{R}$ if and only if they are linearly independent over $\mathbb{Z}$.


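Proof If $\alpha_1, \alpha_2, \ldots, \alpha_m$ are linearly independent over $\mathbb{R}$, trivially they are linearly independent over $\mathbb{Z}$. Suppose that $\alpha_1, \alpha_2, \ldots, \alpha_m$ are linearly independent over $\mathbb{Z}$; we consider an arbitrary linear combination over $\mathbb{R}$. Let
$$a_1\alpha_1 + a_2\alpha_2 + \cdots + a_m\alpha_m = 0, \quad a_i \in \mathbb{R}. \tag{4}$$
We should prove that (1.4) is equivalent to $a_1 = a_2 = \cdots = a_m = 0$, which implies that $\alpha_1, \alpha_2, \ldots, \alpha_m$ are linearly independent over $\mathbb{R}$.

By Minkowski's Third Theorem (see Theorem VII of Cassels (1963)), for any sufficiently large $N > 1$, there are a positive integer $q \geq 1$ and integers $p_1, p_2, \ldots, p_m \in \mathbb{Z}$ such that
$$\max_{1 \leq i \leq m} |q a_i - p_i| < N^{-\frac{1}{m}}, \quad 1 \leq q \leq N. \tag{5}$$

By (1.4), we have
$$|p_1\alpha_1 + p_2\alpha_2 + \cdots + p_m\alpha_m| = |(qa_1 - p_1)\alpha_1 + (qa_2 - p_2)\alpha_2 + \cdots + (qa_m - p_m)\alpha_m| \leq m N^{-\frac{1}{m}} \max_{1 \leq i \leq m} |\alpha_i|. \tag{6}$$

Let $\lambda$ be the minimum distance of $L$, and $\epsilon > 0$ be any positive real number. We select $N$ such that
$$N > \max\left\{ \left(\frac{m}{\epsilon}\right)^m, \ \left(\frac{m}{\lambda}\right)^m \max_{1 \leq i \leq m} |\alpha_i|^m \right\}.$$

It follows that $m N^{-\frac{1}{m}} < \epsilon$ and
$$m N^{-\frac{1}{m}} \max_{1 \leq i \leq m} |\alpha_i| < \lambda.$$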
l<i<m 
By (1.6) we have
$$|p_1\alpha_1 + p_2\alpha_2 + \cdots + p_m\alpha_m| < \lambda.$$

Since $p_1\alpha_1 + p_2\alpha_2 + \cdots + p_m\alpha_m \in L$, we have $p_1\alpha_1 + p_2\alpha_2 + \cdots + p_m\alpha_m = 0$, and $p_1 = p_2 = \cdots = p_m = 0$. By (1.5) we have $q|a_i| < \frac{1}{m}\epsilon$ for all $i$, $1 \leq i \leq m$. Since $\epsilon$ is an arbitrarily small positive number, we must have $a_1 = a_2 = \cdots = a_m = 0$. This completes the proof of the lemma.

Suppose that $B \in \mathbb{R}^{n \times m}$ is an $n \times m$ matrix with $\operatorname{rank}(B) = m$, and $B'$ is the transpose of $B$. It is easy to verify that
$$\operatorname{rank}(B'B) = \operatorname{rank}(B) = m \ \Rightarrow \ \det(B'B) \neq 0,$$
which implies that $B'B$ is an invertible square matrix of dimension $m \times m$. Since $B'B$ is a positive definite symmetric matrix, there is an orthogonal matrix $P \in \mathbb{R}^{m \times m}$ such that
$$P'B'BP = \operatorname{diag}\{\delta_1, \delta_2, \ldots, \delta_m\}, \tag{7}$$
where $\delta_i > 0$ are the characteristic values of $B'B$, and $\operatorname{diag}\{\delta_1, \delta_2, \ldots, \delta_m\}$ is the diagonal matrix of dimension $m \times m$.


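Lemma 2 Suppose that $B \in \mathbb{R}^{n \times m}$ with $\operatorname{rank}(B) = m$, $\delta_1, \delta_2, \ldots, \delta_m$ are the $m$ characteristic values of $B'B$, and $\lambda(L(B))$ is the minimum distance of the lattice $L(B)$. Then we have
$$\lambda(L(B)) = \min_{x \in \mathbb{Z}^m,\ x \neq 0} |Bx| \geq \sqrt{\delta}, \tag{8}$$
where $\delta = \min\{\delta_1, \delta_2, \ldots, \delta_m\}$.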
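Proof Let $A = B'B$. By (1.7), there exists an orthogonal matrix $P \in \mathbb{R}^{m \times m}$ such that
$$P'AP = \operatorname{diag}\{\delta_1, \delta_2, \ldots, \delta_m\}.$$

If $x \in \mathbb{Z}^m$, $x \neq 0$, we have
$$|Bx|^2 = x'Ax = x'P(P'AP)P'x = (P'x)' \operatorname{diag}\{\delta_1, \delta_2, \ldots, \delta_m\} P'x \geq \delta|P'x|^2 = \delta|x|^2.$$

Since $x \in \mathbb{Z}^m$ and $x \neq 0$, we have $|x|^2 \geq 1$, and it follows that
$$\min_{x \in \mathbb{Z}^m,\ x \neq 0} |Bx| \geq \sqrt{\delta}\,|x| \geq \sqrt{\delta}.$$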


We have Lemma 2 immediately. 
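The bound of Lemma 2 can be checked on a small example. The sketch below is an illustration with hypothetical helper names; it is restricted to two generators so that the smallest characteristic value of $B'B$ has a closed form, and it finds the minimum distance by brute force over a small box of integer coefficients, which suffices here because $|Bx| \geq \sqrt{\delta}\,|x|$ rules out large $x$.

```python
import math
from itertools import product


def gram(B):
    """B'B for a matrix B given as a list of rows."""
    m = len(B[0])
    return [[sum(row[i] * row[j] for row in B) for j in range(m)] for i in range(m)]


def min_eig_2x2(G):
    """Smallest characteristic value of a symmetric 2x2 matrix."""
    tr = G[0][0] + G[1][1]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return (tr - math.sqrt(tr * tr - 4 * det)) / 2


def shortest_norm(B, bound=10):
    """Brute-force min |Bx| over nonzero integer x with entries in [-bound, bound]."""
    best = math.inf
    for x in product(range(-bound, bound + 1), repeat=len(B[0])):
        if any(x):
            v = [sum(r[j] * x[j] for j in range(len(x))) for r in B]
            best = min(best, math.sqrt(sum(t * t for t in v)))
    return best


B = [[2, 1], [0, 3], [1, 1]]          # a 3 x 2 matrix of rank 2
lower = math.sqrt(min_eig_2x2(gram(B)))
lam = shortest_norm(B)
print(lower <= lam, round(lower, 3), round(lam, 3))
```

Here $B'B$ has rows $(5, 3)$ and $(3, 11)$, the shortest vector is the first column of $B$ with norm $\sqrt{5}$, and the lower bound $\sqrt{\delta}$ is slightly smaller, as the lemma predicts.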


Another application of Lemma 2 is to give a computable upper bound for the smoothing parameter (see Theorem 5). Combining Lemmas 1 and 2, we show the following assertion.

Theorem 1 Let $L \subset \mathbb{R}^n$ be a subset. Then $L$ is a discrete subgroup if and only if there is an $n \times m$ matrix $B \in \mathbb{R}^{n \times m}$ with $\operatorname{rank}(B) = m$ such that
$$L = L(B) = \{Bx \mid x \in \mathbb{Z}^m\}. \tag{9}$$

Proof If $L \subset \mathbb{R}^n$ is a discrete subgroup, then $L$ is a free $\mathbb{Z}$-module. By Lemma 1, we have $\operatorname{rank}_{\mathbb{Z}}(L) = m \leq n$. Let $\beta_1, \beta_2, \ldots, \beta_m$ be a $\mathbb{Z}$-basis of $L$; then
$$L = \left\{ \sum_{i=1}^{m} a_i \beta_i \ \Big|\ a_i \in \mathbb{Z} \right\}.$$
Writing $B = [\beta_1, \beta_2, \ldots, \beta_m]_{n \times m}$, the rank of the matrix $B$ is $m$, and
$$L = \{Bx \mid x \in \mathbb{Z}^m\} = L(B).$$

Conversely, let $L(B)$ be an arbitrary lattice generated by $B$. Obviously, $L(B)$ is an additive subgroup of $\mathbb{R}^n$, and by Lemma 2, $L(B)$ is also a discrete subgroup. We have Theorem 1 at once.


Corollary 1 Let $L \subset \mathbb{R}^n$ be a lattice and $G \subset L$ be an additive subgroup of $L$. Then $G$ is a lattice of $\mathbb{R}^n$.

Corollary 2 Let $L \subset \mathbb{Z}^n$ be an additive subgroup. Then $L$ is a lattice of $\mathbb{R}^n$. These lattices are called integer lattices.

According to Theorem 1 above, a lattice $L(B)$ is equivalent to a discrete subgroup of $\mathbb{R}^n$. Suppose $L = L(B)$ is a lattice with generator matrix $B \in \mathbb{R}^{n \times m}$ and $\operatorname{rank}(B) = m$; we write $\operatorname{rank}(L) = \operatorname{rank}(B)$, and
$$d(L) = \sqrt{\det(B'B)}. \tag{10}$$

In particular, if $\operatorname{rank}(L) = n$, i.e. $L$ is a full-rank lattice, then $d(L) = |\det(B)|$ as usual. A sublattice $N$ of $L$ means a discrete additive subgroup of $L$; the quotient group is written $L/N$, and the cardinality of $L/N$ is denoted by $|L/N|$.


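Lemma 3 Let $L \subset \mathbb{R}^n$ be a lattice and $N \subset L$ be a sublattice. If $\operatorname{rank}(N) = \operatorname{rank}(L)$, then the quotient group $L/N$ is a finite group.

Proof Let $\operatorname{rank}(L) = m$ and $L = L(B)$, where $B \in \mathbb{R}^{n \times m}$ with $\operatorname{rank}(B) = m$. We define a mapping $\sigma$ from $L$ to $\mathbb{Z}^m$ by $\sigma(Bx) = x$. Clearly, $\sigma$ is an additive group isomorphism, $\sigma(N) \subset \mathbb{Z}^m$ is a full-rank lattice of $\mathbb{Z}^m$, and $L/N \cong \mathbb{Z}^m/\sigma(N)$. It is a well-known result that
$$|\mathbb{Z}^m/\sigma(N)| = d(\sigma(N)).$$

It follows that
$$|L/N| = |\mathbb{Z}^m/\sigma(N)| = d(\sigma(N)).$$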


Lemma 3 follows.

Suppose that $L_1 \subset \mathbb{R}^n$ and $L_2 \subset \mathbb{R}^n$ are two lattices of $\mathbb{R}^n$; we define $L_1 + L_2 = \{a + b \mid a \in L_1, b \in L_2\}$. Obviously, $L_1 + L_2$ is an additive subgroup of $\mathbb{R}^n$, but generally speaking, $L_1 + L_2$ is not a lattice of $\mathbb{R}^n$ again.


Lemma 4 Let $L_1 \subset \mathbb{R}^n$ and $L_2 \subset \mathbb{R}^n$ be two lattices of $\mathbb{R}^n$. If $\operatorname{rank}(L_1 \cap L_2) = \operatorname{rank}(L_1)$ or $\operatorname{rank}(L_1 \cap L_2) = \operatorname{rank}(L_2)$, then $L_1 + L_2$ is again a lattice of $\mathbb{R}^n$.

Proof To prove that $L_1 + L_2$ is a lattice of $\mathbb{R}^n$, by Theorem 1 it is sufficient to prove that $L_1 + L_2$ is a discrete subgroup of $\mathbb{R}^n$. Suppose that $\operatorname{rank}(L_1 \cap L_2) = \operatorname{rank}(L_1)$. For any $x \in L_1$, we define a distance function $\rho(x)$ by
$$\rho(x) = \inf\{|x - y| \mid y \neq x,\ y \in L_2\}.$$

Since there are only finitely many vectors in $L_2 \cap b(x, \delta)$, where $b(x, \delta)$ is any ball of center $x$ with radius $\delta$, we have
$$\rho(x) = \min\{|x - y| \mid y \neq x,\ y \in L_2\} = \lambda_x > 0. \tag{11}$$

On the other hand, if $x_1 \in L_1$, $x_2 \in L_1$, and $x_1 - x_2 \in L_2$, then there is $y_0 \in L_2$ such that $x_1 = x_2 + y_0$, and we have $\rho(x_1) = \rho(x_2)$. It means that $\rho(x)$ is defined over the quotient group $(L_1 + L_2)/L_2$. We have the following group isomorphism theorem:
$$(L_1 + L_2)/L_2 \cong L_1/(L_1 \cap L_2).$$

By Lemma 3, it follows that
$$|(L_1 + L_2)/L_2| = |L_1/(L_1 \cap L_2)| < \infty.$$

In other words, $(L_1 + L_2)/L_2$ is also a finite group. Let $x_1, x_2, \ldots, x_k$ be the representative elements of $(L_1 + L_2)/L_2$; we have
$$\min_{x \in L_1,\ y \in L_2,\ x \neq y} |x - y| = \min_{1 \leq i \leq k} \rho(x_i) \geq \min\{\lambda_{x_1}, \lambda_{x_2}, \ldots, \lambda_{x_k}\} > 0.$$

Therefore, $L_1 + L_2$ is a discrete subgroup of $\mathbb{R}^n$, thus it is a lattice of $\mathbb{R}^n$ by Theorem 1.


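Remark 1 The condition $\operatorname{rank}(L_1 \cap L_2) = \operatorname{rank}(L_1)$ or $\operatorname{rank}(L_1 \cap L_2) = \operatorname{rank}(L_2)$ in Lemma 4 seems to be necessary. As a counterexample, consider the real line $\mathbb{R}$ and let $L_1 = \mathbb{Z}$ and $L_2 = \sqrt{2}\,\mathbb{Z}$. Then $L_1 + L_2$ is not a discrete subgroup of $\mathbb{R}$, thus $L_1 + L_2$ is not a lattice in $\mathbb{R}$, because $L_1 + L_2 = \{n + \sqrt{2}m \mid n \in \mathbb{Z}, m \in \mathbb{Z}\}$ is dense in $\mathbb{R}$ by Dirichlet's Theorem (see Theorem I of Cassels (1963)).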
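The failure of discreteness in Remark 1 is easy to observe numerically. The sketch below, with a hypothetical helper name, searches for small nonzero elements $n + \sqrt{2}m$ of $\mathbb{Z} + \sqrt{2}\,\mathbb{Z}$ and shows that they shrink as the search range grows, so no positive minimum distance can exist; floating-point arithmetic is an assumption that is adequate for the small ranges used here.

```python
import math


def smallest_element(M):
    """Smallest |n + m*sqrt(2)| over 1 <= m <= M, with n the nearest integer to -m*sqrt(2).

    Returns a tuple (value, n, m).
    """
    best = (math.inf, 0, 0)
    for m in range(1, M + 1):
        n = -round(m * math.sqrt(2))
        best = min(best, (abs(n + m * math.sqrt(2)), n, m))
    return best


for M in (10, 100, 1000):
    value, n, m = smallest_element(M)
    print(M, n, m, value)
```

The minimisers are the convergents of the continued fraction of $\sqrt{2}$ (for example $m = 5$, $n = -7$ for $M = 10$), and the corresponding values decrease roughly like $1/m$.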


As a direct consequence, we have the following generalized form of Lemma 4.

Corollary 3 Let $L_1, L_2, \ldots, L_m$ be $m$ lattices of $\mathbb{R}^n$ and
$$\operatorname{rank}(L_1 \cap L_2 \cap \cdots \cap L_m) = \operatorname{rank}(L_j) \text{ for some } 1 \leq j \leq m.$$

Then $L_1 + L_2 + \cdots + L_m$ is a lattice of $\mathbb{R}^n$.

Proof Without loss of generality, we assume that
$$\operatorname{rank}(L_1 \cap L_2 \cap \cdots \cap L_m) = \operatorname{rank}(L_m).$$
Let $L_1 + L_2 + \cdots + L_{m-1} = L'$; then
$$(L' + L_m)/L' \cong L_m/(L' \cap L_m).$$

Since $\operatorname{rank}(L' \cap L_m) = \operatorname{rank}(L_m)$, by Lemma 4, $L' + L_m = L_1 + L_2 + \cdots + L_m$ is a lattice of $\mathbb{R}^n$, and the corollary follows.


2 Ideal Matrices 


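Let $\mathbb{R}[x]$ and $\mathbb{Z}[x]$ be the polynomial rings over $\mathbb{R}$ and $\mathbb{Z}$ with variable $x$, respectively. Suppose that
$$\phi(x) = x^n - \phi_{n-1}x^{n-1} - \cdots - \phi_1 x - \phi_0 \in \mathbb{Z}[x], \quad \phi_0 \neq 0, \tag{12}$$
is a polynomial with integer coefficients which has no multiple roots in the complex number field $\mathbb{C}$. Let $w_1, w_2, \ldots, w_n$ be the $n$ different roots of $\phi(x)$ in $\mathbb{C}$; the Vandermonde matrix $V_\phi$ is defined by
$$V_\phi = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ w_1 & w_2 & \cdots & w_n \\ \vdots & \vdots & & \vdots \\ w_1^{n-1} & w_2^{n-1} & \cdots & w_n^{n-1} \end{pmatrix}, \quad \text{and } \det(V_\phi) \neq 0. \tag{13}$$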


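According to the given polynomial $\phi(x)$, we define a rotation matrix $H = H_\phi$ by
$$H = H_\phi = \begin{pmatrix} 0 & \cdots & 0 & \phi_0 \\ & & & \phi_1 \\ & I_{n-1} & & \vdots \\ & & & \phi_{n-1} \end{pmatrix}_{n \times n} \in \mathbb{Z}^{n \times n}, \tag{14}$$
where $I_{n-1}$ is the $(n-1) \times (n-1)$ unit matrix. Obviously, the characteristic polynomial of $H$ is just $\phi(x)$.

We use column notation for vectors in $\mathbb{R}^n$. For any $f = (f_0, f_1, \ldots, f_{n-1})' \in \mathbb{R}^n$, the ideal matrix generated by the vector $f$ is defined by
$$H^*(f) = [f, Hf, H^2 f, \ldots, H^{n-1}f]_{n \times n} \in \mathbb{R}^{n \times n}, \tag{15}$$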


which is a block matrix in terms of its columns $H^k f$ $(0 \leq k \leq n-1)$. Sometimes, $f$ is called an input vector. It is easily seen that $H^*(f)$ is a more general form of the classical circulant matrix (see Davis (1994)) and the $r$-circulant matrix (see Shi (2018), Yasin and Taskara (2013)). In fact, if $\phi(x) = x^n - 1$, then $H^*(f)$ is the ordinary circulant matrix generated by $f$. If $\phi(x) = x^n - r$, then $H^*(f)$ is the $r$-circulant matrix.

By (2.4), it follows immediately that
$$H^*(f + g) = H^*(f) + H^*(g), \quad \text{and} \quad H^*(\lambda f) = \lambda H^*(f), \ \forall \lambda \in \mathbb{R}. \tag{16}$$

Moreover, $H^*(f) = 0$ is the zero matrix if and only if $f = 0$ is the zero vector, thus one has $H^*(f) = H^*(g)$ if and only if $f = g$. Let $M^*$ be the set of all ideal matrices, namely
$$M^* = \{H^*(f) \mid f \in \mathbb{R}^n\}. \tag{17}$$

We may regard $H^*$ as a mapping from $\mathbb{R}^n$ to $M^*$, which is a one-to-one correspondence.

In Zheng et al. (2023), we have shown some basic properties of ideal matrices, most of which may be summarized in the following theorem.

Theorem 2 Suppose that $\phi(x) \in \mathbb{Z}[x]$ is a fixed polynomial with no multiple roots in $\mathbb{C}$. Then for any two column vectors $f$ and $g$ in $\mathbb{R}^n$, we have

(i) $H^*(f) = f_0 I_n + f_1 H + \cdots + f_{n-1}H^{n-1}$;

(ii) $H^*(f)H^*(g) = H^*(H^*(f)g)$ and $H^*(f)H^*(g) = H^*(g)H^*(f)$;

(iii) $H^*(f) = V_\phi^{-1} \operatorname{diag}\{f(w_1), f(w_2), \ldots, f(w_n)\} V_\phi$;

(iv) $\det(H^*(f)) = \prod_{i=1}^{n} f(w_i)$;

(v) $H^*(f)$ is an invertible matrix if and only if $(f(x), \phi(x)) = 1$ in $\mathbb{R}[x]$,

where $V_\phi$ is the Vandermonde matrix given by (2.2), $w_i$ $(1 \leq i \leq n)$ are all the roots of $\phi(x)$ in $\mathbb{C}$, and $\operatorname{diag}\{f(w_1), f(w_2), \ldots, f(w_n)\}$ is the diagonal matrix.


Proof See Theorem 2 of Zheng et al. (2023). 
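Property (iv) lends itself to a quick numerical check. The sketch below is an illustration under our own assumptions: the helper names are hypothetical, $H$ is taken in the companion form of (14), and $\phi(x) = x^n - r$ with $r > 0$ is used because its roots are known in closed form. The determinant is computed exactly with rationals and compared with the floating-point product $\prod f(w_i)$.

```python
import cmath
from fractions import Fraction


def ideal_matrix(a, f):
    """Columns f, Hf, ..., H^(n-1) f for phi(x) = x^n - a[n-1] x^(n-1) - ... - a[0]."""
    n = len(a)
    cols = [list(f)]
    for _ in range(n - 1):
        v = cols[-1]
        # (Hv)_0 = a_0 v_{n-1}, (Hv)_i = v_{i-1} + a_i v_{n-1}
        cols.append([a[0] * v[-1]] + [v[i - 1] + a[i] * v[-1] for i in range(1, n)])
    return [list(row) for row in zip(*cols)]


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n, d = len(A), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if A[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            k = A[r][c] / A[c][c]
            A[r] = [x - k * y for x, y in zip(A[r], A[c])]
    return d


def roots_x_n_minus_r(n, r):
    """The n complex roots of x^n - r for real r > 0."""
    rho = r ** (1.0 / n)
    return [rho * cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def product_over_roots(f, roots):
    prod = 1
    for w in roots:
        prod *= sum(c * w ** j for j, c in enumerate(f))
    return prod


a, f = [2, 0, 0, 0], [1, 2, 0, 1]      # phi(x) = x^4 - 2, f(x) = 1 + 2x + x^3
lhs = det(ideal_matrix(a, f))
rhs = product_over_roots(f, roots_x_n_minus_r(4, 2))
print(lhs, rhs)                         # rhs should match lhs up to rounding error
```

The smallest case can be done by hand: for $\phi(x) = x^2 - 2$ and $f(x) = 1 + x$, the ideal matrix has rows $(1, 2)$ and $(1, 1)$ with determinant $-1 = (1 + \sqrt{2})(1 - \sqrt{2})$.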


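Let $e_1, e_2, \ldots, e_n$ be the unit vectors of $\mathbb{R}^n$, that is
$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$
It is easy to verify that
$$H^*(e_1) = I_n, \quad \text{and} \quad H^*(e_k) = H^{k-1}, \quad 1 \leq k \leq n. \tag{18}$$

This means that the unit matrix $I_n$ and the rotation matrices $H^k$ $(1 \leq k \leq n-1)$ are all ideal matrices.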

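Let $\phi(x)\mathbb{R}[x]$ and $\phi(x)\mathbb{Z}[x]$ be the principal ideals generated by $\phi(x)$ in $\mathbb{R}[x]$ and $\mathbb{Z}[x]$, respectively. We denote the quotient rings $R$ and $\bar{R}$ by
$$R = \mathbb{Z}[x]/\phi(x)\mathbb{Z}[x], \quad \text{and} \quad \bar{R} = \mathbb{R}[x]/\phi(x)\mathbb{R}[x]. \tag{19}$$

There is a one-to-one correspondence between $\bar{R}$ and $\mathbb{R}^n$ given by
$$f(x) = f_0 + f_1 x + \cdots + f_{n-1}x^{n-1} \in \bar{R} \ \longmapsto \ f = \begin{pmatrix} f_0 \\ f_1 \\ \vdots \\ f_{n-1} \end{pmatrix} \in \mathbb{R}^n.$$
We denote this correspondence by $t$, that is
$$t(f(x)) = f \quad \text{and} \quad t^{-1}(f) = f(x), \quad \forall f(x) \in \bar{R} \text{ and } f \in \mathbb{R}^n. \tag{20}$$

If we restrict $t$ to the quotient ring $R$, it gives a one-to-one correspondence between $R$ and $\mathbb{Z}^n$. First, we show that $t$ is also a ring isomorphism.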


Definition 2 For any two column vectors $f$ and $g$ in $\mathbb{R}^n$, we define the $\phi$-convolutional product $f * g$ by $f * g = H^*(f)g$.

By Theorem 2, it is easy to see that
$$f * g = g * f, \quad \text{and} \quad H^*(f * g) = H^*(f)H^*(g). \tag{21}$$

Lemma 5 For any two polynomials $f(x)$ and $g(x)$ in $\bar{R}$, we have
$$t(f(x)g(x)) = H^*(f)g = f * g.$$

Proof Let $g(x) = g_0 + g_1 x + \cdots + g_{n-1}x^{n-1} \in \bar{R}$; then
$$xg(x) = \phi_0 g_{n-1} + (g_0 + \phi_1 g_{n-1})x + \cdots + (g_{n-2} + \phi_{n-1}g_{n-1})x^{n-1}.$$
It follows that
$$t(xg(x)) = Ht(g(x)) = Hg. \tag{22}$$

Hence, for any $0 \leq k \leq n-1$, we have
$$t(x^k g(x)) = H^k t(g(x)) = H^k g, \quad 0 \leq k \leq n-1. \tag{23}$$
Let $f(x) = f_0 + f_1 x + \cdots + f_{n-1}x^{n-1} \in \bar{R}$. By (i) of Theorem 2, we have
$$t(f(x)g(x)) = \sum_{i=0}^{n-1} f_i\, t(x^i g(x)) = \sum_{i=0}^{n-1} f_i H^i g = H^*(f)g.$$
The lemma follows. 
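Lemma 5 says that multiplying polynomials modulo $\phi(x)$ is the same as applying the ideal matrix. A minimal sketch of this check follows; the function names are ours, and $H$ is again assumed to be the companion matrix of (14), implemented directly through its action on a vector.

```python
def polymul_mod(f, g, a):
    """Coefficients of f(x) g(x) mod phi(x), phi(x) = x^n - a[n-1] x^(n-1) - ... - a[0]."""
    n = len(a)
    prod = [0] * (2 * n - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] += fi * gj
    # Eliminate x^k for k >= n using x^n = a[0] + a[1] x + ... + a[n-1] x^(n-1).
    for k in range(2 * n - 2, n - 1, -1):
        c, prod[k] = prod[k], 0
        for i in range(n):
            prod[k - n + i] += c * a[i]
    return prod[:n]


def ideal_matrix(a, f):
    """Columns f, Hf, ..., H^(n-1) f, returned as a list of rows."""
    n = len(a)
    cols = [list(f)]
    for _ in range(n - 1):
        v = cols[-1]
        cols.append([a[0] * v[-1]] + [v[i - 1] + a[i] * v[-1] for i in range(1, n)])
    return [list(row) for row in zip(*cols)]


def conv(a, f, g):
    """phi-convolutional product f * g = H^*(f) g."""
    return [sum(x * y for x, y in zip(row, g)) for row in ideal_matrix(a, f)]


a = [1, 1, 0]                       # phi(x) = x^3 - x - 1
f, g = [1, 0, 2], [1, 2, 3]         # f(x) = 1 + 2x^2, g(x) = 1 + 2x + 3x^2
print(polymul_mod(f, g, a))         # -> [5, 12, 11]
print(conv(a, f, g))                # -> [5, 12, 11]
```

By hand, $(1 + 2x^2)(1 + 2x + 3x^2) = 1 + 2x + 5x^2 + 4x^3 + 6x^4$, and substituting $x^3 = x + 1$, $x^4 = x^2 + x$ gives $5 + 12x + 11x^2$, in agreement with both outputs.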


Theorem 3 Under the $\phi$-convolutional product, $\mathbb{R}^n$ is a commutative ring with identity element $e_1$ and $\mathbb{Z}^n \subset \mathbb{R}^n$ is its subring. Moreover, we have the following ring isomorphisms:
$$\bar{R} \cong \mathbb{R}^n \cong M^*, \quad \text{and} \quad R \cong \mathbb{Z}^n \cong M^*_{\mathbb{Z}},$$
where $M^*$ is the set of all ideal matrices given by (2.6), and $M^*_{\mathbb{Z}}$ is the set of all integer ideal matrices.

Proof Let $f(x) \in \bar{R}$ and $g(x) \in \bar{R}$; then
$$t(f(x) + g(x)) = f + g = t(f(x)) + t(g(x)),$$
and
$$t(f(x)g(x)) = H^*(f)g = f * g = t(f(x)) * t(g(x)).$$

This means that $t$ is a ring isomorphism. Since $f * g = g * f$ and $e_1 * g = H^*(e_1)g = I_n g = g$, $\mathbb{R}^n$ is a commutative ring with $e_1$ as the identity element. Noting that $H^*(f)$ is an integer matrix if and only if $f \in \mathbb{Z}^n$ is an integer vector, the isomorphism of subrings follows immediately.


According to property (v) of Theorem 2, $H^*(f)$ is an invertible matrix whenever $(f(x), \phi(x)) = 1$ in $\mathbb{R}[x]$. We show that the inverse of an ideal matrix is again an ideal matrix.

Lemma 6 Let $f(x) \in \bar{R}$ and $(f(x), \phi(x)) = 1$ in $\mathbb{R}[x]$. Then
$$(H^*(f))^{-1} = H^*(u),$$
where $u(x) \in \bar{R}$ is the unique polynomial such that $u(x)f(x) \equiv 1 \ (\mathrm{mod}\ \phi(x))$.


Proof By Lemma 5, we have $u * f = e_1$, and it follows that
$$H^*(u)H^*(f) = H^*(e_1) = I_n.$$

Thus we have $(H^*(f))^{-1} = H^*(u)$. It is worth noting that if $H^*(f)$ is an invertible integer matrix, then $(H^*(f))^{-1}$ is not an integer matrix in general.
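Lemma 6 can be illustrated with exact rational arithmetic. In the sketch below (hypothetical helper names, $H$ in the companion form of (14)), $u$ is obtained by solving $H^*(f)u = e_1$, which is equivalent to $u * f = e_1$ by commutativity, and then $H^*(u)H^*(f) = I_n$ is verified.

```python
from fractions import Fraction


def ideal_matrix(a, f):
    """Columns f, Hf, ..., H^(n-1) f, returned as a list of rows."""
    n = len(a)
    cols = [list(f)]
    for _ in range(n - 1):
        v = cols[-1]
        cols.append([a[0] * v[-1]] + [v[i - 1] + a[i] * v[-1] for i in range(1, n)])
    return [list(row) for row in zip(*cols)]


def solve(M, b):
    """Solve M x = b over the rationals by Gauss-Jordan (M is assumed invertible)."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(b[i])] for i, row in enumerate(M)]
    for c in range(n):
        piv = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                k = A[r][c]
                A[r] = [x - k * y for x, y in zip(A[r], A[c])]
    return [A[i][n] for i in range(n)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


a, f = [1, 1, 0], [1, 0, 2]             # phi(x) = x^3 - x - 1, f(x) = 1 + 2x^2
e1 = [1, 0, 0]
u = solve(ideal_matrix(a, f), e1)       # coefficients of u(x) with u f = 1 mod phi
print(u)                                # -> [Fraction(9, 17), Fraction(4, 17), Fraction(-6, 17)]
identity = [[int(i == j) for j in range(3)] for i in range(3)]
print(mat_mul(ideal_matrix(a, u), ideal_matrix(a, f)) == identity)   # -> True
```

Here $\det H^*(f) = 17$, so the inverse has denominators 17 although $H^*(f)$ is an integer matrix, which matches the closing remark of the proof.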


Sometimes the following lemma may be useful, especially when we consider an integer matrix.

Lemma 7 Let $f(x) \in \mathbb{Z}[x]$ and $(f(x), \phi(x)) = 1$ in $\mathbb{Z}[x]$. Then we have $(f(x), \phi(x)) = 1$ in $\mathbb{R}[x]$.

Proof Let $\mathbb{Q}$ be the rational number field. Since $(f(x), \phi(x)) = 1$ in $\mathbb{Z}[x]$, we have $(f(x), \phi(x)) = 1$ in $\mathbb{Q}[x]$. We know that $\mathbb{Q}[x]$ is a principal ideal domain, thus there are two polynomials $a(x)$ and $b(x)$ in $\mathbb{Q}[x]$ such that
$$a(x)f(x) + b(x)\phi(x) = 1.$$

This means that $(f(x), \phi(x)) = 1$ in $\mathbb{R}[x]$.


3 Cyclic Lattices and Ideal Lattices 


As is well known, cyclic codes play a central role in algebraic coding theory (see Chap. 6 of Lint (1999)). In Zheng et al. (2023), we extended ordinary cyclic codes to more general forms, namely $\phi$-cyclic codes. To obtain an analogous concept of $\phi$-cyclic code in $\mathbb{R}^n$, we note that every rotation matrix $H$ defines a linear transformation of $\mathbb{R}^n$ by $x \to Hx$.

Definition 3 A linear subspace $C \subset \mathbb{R}^n$ is called a $\phi$-cyclic subspace if $\forall \alpha \in C \Rightarrow H\alpha \in C$. A lattice $L \subset \mathbb{R}^n$ is called a $\phi$-cyclic lattice if $\forall \alpha \in L \Rightarrow H\alpha \in L$.

In other words, a $\phi$-cyclic subspace $C$ is a linear subspace of $\mathbb{R}^n$ which is closed under the linear transformation $H$. A $\phi$-cyclic lattice $L$ is a lattice of $\mathbb{R}^n$ which is closed under $H$. If $\phi(x) = x^n - 1$, then $H$ is the classical circulant matrix and the corresponding cyclic lattice first appeared in Micciancio (2002), but further properties of these lattices were not discussed there. To obtain the explicit algebraic construction of $\phi$-cyclic lattices, we first show that there is a one-to-one correspondence between $\phi$-cyclic subspaces of $\mathbb{R}^n$ and the ideals of $\bar{R}$.


Lemma 8 Let $t$ be the correspondence between $\bar{R}$ and $\mathbb{R}^n$ given by (2.9). Then a subset $C \subset \mathbb{R}^n$ is a $\phi$-cyclic subspace of $\mathbb{R}^n$ if and only if $t^{-1}(C) \subset \bar{R}$ is an ideal.

Proof We extend the correspondence $t$ to subsets of $\bar{R}$ and $\mathbb{R}^n$ by
$$C(x) \subset \bar{R} \ \xrightarrow{\ t\ } \ C = \{c \mid c(x) \in C(x)\} \subset \mathbb{R}^n. \tag{24}$$

Let $C(x) \subset \bar{R}$ be an ideal; it is clear that $C = t(C(x))$ is a linear subspace of $\mathbb{R}^n$. To prove that $C$ is a $\phi$-cyclic subspace, we note that if $c(x) \in C(x)$, then by (2.11)
$$xc(x) \in C(x) \ \Leftrightarrow \ Ht(c(x)) = Hc \in C.$$
Therefore, if $C(x)$ is an ideal of $\bar{R}$, then $t(C(x)) = C$ is a $\phi$-cyclic subspace of $\mathbb{R}^n$. Conversely, if $C \subset \mathbb{R}^n$ is a $\phi$-cyclic subspace, then for any $k \geq 1$ we have $H^k c \in C$ whenever $c \in C$, which implies
$$\forall c(x) \in C(x) \ \Rightarrow \ x^k c(x) \in C(x), \quad 0 \leq k \leq n-1,$$
which means that $C(x)$ is an ideal of $\bar{R}$. This completes the proof.


By the above lemma, to find a $\phi$-cyclic subspace in $\mathbb{R}^n$, it is enough to find an ideal of $\bar{R}$. There are two trivial ideals $C(x) = 0$ and $C(x) = \bar{R}$; the corresponding $\phi$-cyclic subspaces are $C = 0$ and $C = \mathbb{R}^n$. To find non-trivial $\phi$-cyclic subspaces, we make use of the homomorphism theorems, which is a standard technique in algebra. Let $\pi$ be the natural homomorphism from $\mathbb{R}[x]$ to $\bar{R}$, with $\ker \pi = \phi(x)\mathbb{R}[x]$. We write $\langle\phi(x)\rangle$ for $\phi(x)\mathbb{R}[x]$. Let $N$ be an ideal of $\mathbb{R}[x]$ satisfying
$$\langle\phi(x)\rangle \subset N \subset \mathbb{R}[x] \ \xrightarrow{\ \pi\ } \ \bar{R} = \mathbb{R}[x]/\langle\phi(x)\rangle. \tag{25}$$

Since $\mathbb{R}[x]$ is a principal ideal domain, $N = \langle g(x)\rangle$ is a principal ideal generated by a monic polynomial $g(x) \in \mathbb{R}[x]$. It is easy to see that
$$\langle\phi(x)\rangle \subset \langle g(x)\rangle \ \Leftrightarrow \ g(x) \mid \phi(x) \text{ in } \mathbb{R}[x].$$
It follows that all ideals $N$ satisfying (3.2) are given by
$$\{\langle g(x)\rangle \mid g(x) \in \mathbb{R}[x] \text{ is monic and } g(x) \mid \phi(x)\}.$$
We write $\langle g(x)\rangle \bmod \phi(x)$ for the image of $\langle g(x)\rangle$ under $\pi$, i.e.
$$\langle g(x)\rangle \bmod \phi(x) = \pi(\langle g(x)\rangle).$$
It is easy to check that
$$\langle g(x)\rangle \bmod \phi(x) = \{a(x)g(x) \mid a(x) \in \mathbb{R}[x] \text{ and } \deg a(x) + \deg g(x) < n\}, \tag{26}$$
more precisely, the right-hand side is a set of representative elements of $\langle g(x)\rangle \bmod \phi(x)$. By the homomorphism theorem in ring theory, all ideals of $\bar{R}$ are given by
$$\{\langle g(x)\rangle \bmod \phi(x) \mid g(x) \in \mathbb{R}[x] \text{ is monic and } g(x) \mid \phi(x)\}. \tag{27}$$

Let $d$ be the number of monic divisors of $\phi(x)$ in $\mathbb{R}[x]$; we have the following.

Corollary 4 The number of $\phi$-cyclic subspaces of $\mathbb{R}^n$ is $d$.


Next, we discuss $\phi$-cyclic lattices, which are the geometric analogue of cyclic codes. The $\phi$-cyclic subspaces of $\mathbb{R}^n$ may be regarded as the algebraic analogue of cyclic codes. Let the quotient rings $R$ and $\bar{R}$ be given by (2.8). An $R$-module is an Abelian group $\Lambda$ such that there is an operation $\lambda\alpha \in \Lambda$ for all $\lambda \in R$ and $\alpha \in \Lambda$, satisfying $1 \cdot \alpha = \alpha$ and $(\lambda_1\lambda_2)\alpha = \lambda_1(\lambda_2\alpha)$. It is easy to see that $\bar{R}$ is an $R$-module; if $\Lambda \subset \bar{R}$ and $\Lambda$ is an $R$-module, then $\Lambda$ is called an $R$-submodule of $\bar{R}$. All $R$-modules we discuss here are $R$-submodules of $\bar{R}$. On the other hand, if $I \subset R$, then $I$ is an ideal of $R$ if and only if $I$ is an $R$-module. Let $\alpha \in \bar{R}$; the cyclic $R$-module generated by $\alpha$ is defined by
$$R\alpha = \{\lambda\alpha \mid \lambda \in R\}. \tag{28}$$
If there are finitely many polynomials $\alpha_1, \alpha_2, \ldots, \alpha_k$ in $\bar{R}$ such that $\Lambda = R\alpha_1 + R\alpha_2 + \cdots + R\alpha_k$, then $\Lambda$ is called a finitely generated $R$-module, which is an $R$-submodule of $\bar{R}$.

Now, let $L \subset \mathbb{R}^n$ be a $\phi$-cyclic lattice, $g \in \mathbb{R}^n$, $H^*(g)$ the ideal matrix generated by the vector $g$, and $L(H^*(g))$ the lattice generated by $H^*(g)$. It is easy to show that any $L(H^*(g))$ is a $\phi$-cyclic lattice and
$$L(H^*(g)) \subset L, \quad \text{whenever } g \in L, \tag{29}$$
which implies that $L(H^*(g))$ is the smallest $\phi$-cyclic lattice which contains the vector $g$. Therefore, we call $L(H^*(g))$ a minimal $\phi$-cyclic lattice in $\mathbb{R}^n$.


Lemma 9 There is a one-to-one correspondence between the minimal $\phi$-cyclic lattices in $\mathbb{R}^n$ and the cyclic $R$-submodules in $\bar{R}$, namely,
$$t(Rg(x)) = L(H^*(g)), \quad \text{for all } g(x) \in \bar{R},$$
and
$$t^{-1}(L(H^*(g))) = Rg(x), \quad \text{for all } g \in \mathbb{R}^n.$$

Proof Let $b(x) \in R$. By Lemma 5, we have
$$t(b(x)g(x)) = H^*(b)g = H^*(g)b \in L(H^*(g)),$$
and $t(Rg(x)) \subset L(H^*(g))$. Conversely, if $\alpha \in L(H^*(g))$ and $\alpha = H^*(g)b$ for some integer vector $b$, by Lemma 5 again we have $b(x)g(x) \in Rg(x)$ and $t(b(x)g(x)) = \alpha$. This implies that $L(H^*(g)) \subset t(Rg(x))$, and
$$t(Rg(x)) = L(H^*(g)).$$
The lemma follows immediately.

Suppose $L = L(\beta_1, \beta_2, \ldots, \beta_m)$ is an arbitrary $\phi$-cyclic lattice, where $B = [\beta_1, \beta_2, \ldots, \beta_m]_{n \times m}$ is the generator matrix of $L$. $L$ may be expressed as the sum of finitely many minimal $\phi$-cyclic lattices; in fact, we have
$$L = L(H^*(\beta_1)) + L(H^*(\beta_2)) + \cdots + L(H^*(\beta_m)). \tag{30}$$

To state and prove our main results, we first give a definition of prime spot in $\mathbb{R}^n$.

Definition 4 Let $g \in \mathbb{R}^n$ and $g(x) = t^{-1}(g) \in \bar{R}$. If $(g(x), \phi(x)) = 1$ in $\mathbb{R}[x]$, we call $g$ a prime spot of $\mathbb{R}^n$.

By (v) of Theorem 2, $g \in \mathbb{R}^n$ is a prime spot if and only if $H^*(g)$ is an invertible matrix, thus the minimal $\phi$-cyclic lattice $L(H^*(g))$ generated by a prime spot is a full-rank lattice.


Lemma 10 Let $g$ and $f$ be two prime spots of $\mathbb{R}^n$. Then $L(H^*(g)) + L(H^*(f))$ is a full-rank $\phi$-cyclic lattice.

Proof According to Lemma 4, it is sufficient to show that
$$\operatorname{rank}(L(H^*(g)) \cap L(H^*(f))) = \operatorname{rank}(L(H^*(g))) = n. \tag{31}$$
In fact, we prove in general that
$$L(H^*(g) \cdot H^*(f)) \subset L(H^*(g)) \cap L(H^*(f)). \tag{32}$$
Since $H^*(g) \cdot H^*(f)$ is an invertible matrix, $\operatorname{rank}(L(H^*(g) \cdot H^*(f))) = n$, and (3.8) follows immediately. To prove (3.9), we note that
$$L(H^*(g) \cdot H^*(f)) = L(H^*(g * f)).$$

It follows that
$$t^{-1}(L(H^*(g) \cdot H^*(f))) = Rg(x)f(x).$$
It is easy to see that
$$Rg(x)f(x) \subset Rg(x) \cap Rf(x).$$

Therefore, we have
$$L(H^*(g) \cdot H^*(f)) = t(Rg(x)f(x)) \subset L(H^*(g)) \cap L(H^*(f)).$$
This completes the proof of Lemma 10.

It is worth noting that (3.9) is true in the more general case and does not need the prime spot condition.


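Corollary 5 Let $\beta_1, \beta_2, \ldots, \beta_m$ be arbitrary $m$ vectors in $\mathbb{R}^n$. Then we have
$$L(H^*(\beta_1)H^*(\beta_2) \cdots H^*(\beta_m)) \subset L(H^*(\beta_1)) \cap L(H^*(\beta_2)) \cap \cdots \cap L(H^*(\beta_m)). \tag{33}$$

Proof If $\beta_1, \beta_2, \ldots, \beta_m$ are integer vectors, then (3.10) is trivial. For the general case, we write
$$L(H^*(\beta_1) \cdot H^*(\beta_2) \cdots H^*(\beta_m)) = L(H^*(\beta_1 * \beta_2 * \cdots * \beta_m)),$$
where $\beta_1 * \beta_2 * \cdots * \beta_m$ is the $\phi$-convolutional product; then
$$t^{-1}(L(H^*(\beta_1) \cdots H^*(\beta_m))) = R\beta_1(x)\beta_2(x) \cdots \beta_m(x).$$

Since
$$R\beta_1(x)\beta_2(x) \cdots \beta_m(x) \subset R\beta_1(x) \cap R\beta_2(x) \cap \cdots \cap R\beta_m(x),$$
it follows that
$$L(H^*(\beta_1)H^*(\beta_2) \cdots H^*(\beta_m)) \subset L(H^*(\beta_1)) \cap L(H^*(\beta_2)) \cap \cdots \cap L(H^*(\beta_m)).$$
This proves the corollary.

By Lemma 10, we also have the following assertion.

Corollary 6 Let $\beta_1, \beta_2, \ldots, \beta_m$ be $m$ prime spots of $\mathbb{R}^n$. Then $L(H^*(\beta_1)) + L(H^*(\beta_2)) + \cdots + L(H^*(\beta_m))$ is a full-rank $\phi$-cyclic lattice.

Proof It follows immediately from Corollary 3.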


Our main result in this chapter is the following one-to-one correspondence between $\phi$-cyclic lattices in $\mathbb{R}^n$ and finitely generated $R$-modules in $\bar{R}$.

Theorem 4 Let $\Lambda = R\alpha_1(x) + R\alpha_2(x) + \cdots + R\alpha_m(x)$ be a finitely generated $R$-module in $\bar{R}$. Then $t(\Lambda)$ is a $\phi$-cyclic lattice in $\mathbb{R}^n$. Conversely, if $L \subset \mathbb{R}^n$ is a $\phi$-cyclic lattice in $\mathbb{R}^n$, then $t^{-1}(L)$ is a finitely generated $R$-module in $\bar{R}$; that is, $t$ gives a one-to-one correspondence.

Proof If $\Lambda$ is a finitely generated $R$-module, by Lemma 9 we have
$$t(\Lambda) = t(R\alpha_1(x) + \cdots + R\alpha_m(x)) = L(H^*(\alpha_1)) + L(H^*(\alpha_2)) + \cdots + L(H^*(\alpha_m)).$$

The main difficulty is to show that $t(\Lambda)$ is a lattice of $\mathbb{R}^n$; we need to embed $t(\Lambda)$ into a full-rank lattice. To do this, let $(\alpha_i(x), \phi(x)) = d_i(x)$, $d_i(x) \in \mathbb{Z}[x]$, and $\beta_i(x) = \alpha_i(x)/d_i(x)$, $1 \leq i \leq m$. Since $\phi(x)$ has no multiple roots by assumption, $(\beta_i(x), \phi(x)) = 1$ in $\mathbb{R}[x]$. In other words, each $t(\beta_i(x)) = \beta_i$ is a prime spot. It is easy to verify that $R\alpha_i(x) \subset R\beta_i(x)$ $(1 \leq i \leq m)$, thus we have
$$t(\Lambda) \subset L(H^*(\beta_1)) + L(H^*(\beta_2)) + \cdots + L(H^*(\beta_m)).$$

By Corollaries 6 and 1, $t(\Lambda)$ is a $\phi$-cyclic lattice. Conversely, if $L \subset \mathbb{R}^n$ is a $\phi$-cyclic lattice of $\mathbb{R}^n$ and $L = L(\beta_1, \beta_2, \ldots, \beta_m)$, by (3.7) we have
$$t^{-1}(L) = R\beta_1(x) + R\beta_2(x) + \cdots + R\beta_m(x),$$
which is a finitely generated $R$-module in $\bar{R}$. This completes the proof of Theorem 4.


As we introduced in abstract, since R is a Noether ring, then J C R is an ideal 
if and only if J is a finitely generated R-module. On the other hand, if Z C R is an 
ideal, then t (7) C Z” is a discrete subgroup of Z”, thus f(/) is a lattice, we define 
the following. 


Definition 5 Let / C R be an ideal, 1 (7) is called the $-ideal lattice. 


Ideal lattices first appeared in Lyubashevsky and Micciancio (2006) (see Definition 3.1 of Lyubashevsky and Micciancio (2006)). As a direct consequence of Theorem 4, we have the following.


Corollary 7 Let $L \subset \mathbb{R}^n$ be a subset, then $L$ is a $\phi$-cyclic lattice if and only if

$$L = L(H^*(\beta_1)) + L(H^*(\beta_2)) + \cdots + L(H^*(\beta_m)),$$

where $\beta_i \in \mathbb{R}^n$ and $m \le n$. Furthermore, $L$ is a $\phi$-ideal lattice if and only if every $\beta_i \in \mathbb{Z}^n$, $1 \le i \le m$.


Corollary 8 Suppose that $\phi(x)$ is an irreducible polynomial in $\mathbb{Z}[x]$, then any non-zero ideal $I$ of $R$ defines a full-rank $\phi$-ideal lattice $t(I) \subset \mathbb{Z}^n$.

Proof Let $I \subset R$ be a non-zero ideal, then we have $I = R\alpha_1(x) + R\alpha_2(x) + \cdots + R\alpha_m(x)$, where $\alpha_i(x) \in R$ and $(\alpha_i(x), \phi(x)) = 1$. It follows that

$$t(I) = L(H^*(\alpha_1)) + L(H^*(\alpha_2)) + \cdots + L(H^*(\alpha_m)).$$

Since each $\alpha_i$ is a prime spot, we have $\operatorname{rank}(t(I)) = n$ by Corollary 6, and the corollary follows at once.


According to Definition 3.1 of Lyubashevsky and Micciancio (2006), we have proved that any ideal of $R$ corresponds to a $\phi$-ideal lattice, which is just a $\phi$-cyclic integer lattice under the more general rotation matrix $H = H_\phi$. Cyclic lattices and ideal lattices were introduced in Micciancio (2002) and Lyubashevsky and Micciancio (2006), respectively, to improve the space complexity of lattice-based cryptosystems. Ideal lattices allow a lattice to be represented using only two polynomials. Using such lattices, lattice-based cryptosystems can reduce their space complexity from $O(n^2)$ to $O(n)$. Ideal lattices also allow computations to be accelerated using the polynomial structure. The original structure of Micciancio's matrices uses the ordinary circulant matrices and allows for an interpretation in terms of arithmetic in the polynomial ring $\mathbb{Z}[x]/\langle x^n - 1\rangle$. Lyubashevsky and Micciancio (2006) later suggested changing the ring to $\mathbb{Z}[x]/\langle\phi(x)\rangle$ with $\phi(x)$ irreducible over $\mathbb{Z}[x]$. Our results here suggest changing the ring to $\mathbb{Z}[x]/\langle\phi(x)\rangle$ with any polynomial $\phi(x)$. There are many works subsequent to Micciancio (2002) and Lyubashevsky and Micciancio (2006), such as (Feige & Micciancio, 2004; Micciancio & Regev, 2009; Peikert, 2016; Plantard & Schneider, 2013; Pradhan et al., 2019; Stehle & Steinfeld, 2011).


Example 1 It is interesting to find some examples of $\phi$-cyclic lattices in an algebraic number field $K$. Let $\mathbb{Q}$ be the rational number field; without loss of generality, an algebraic number field $K$ of degree $n$ is just $K = \mathbb{Q}(w)$, where $w = w_i$ is a root of $\phi(x)$. If all $\mathbb{Q}(w_i) \subset \mathbb{R}$ ($1 \le i \le n$), then $K$ is called a totally real algebraic number field. Let $O_K$ be the ring of algebraic integers of $K$, and $I \subset O_K$ be an ideal, $I \ne 0$. There is an integral basis $\{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subset I$ such that

$$I = \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots + \mathbb{Z}\alpha_n.$$

We may regard every ideal of $O_K$ as a lattice in $\mathbb{Q}^n$, and our assertion is that every non-zero ideal of $O_K$ corresponds to a full-rank $\phi$-cyclic lattice of $\mathbb{Q}^n$. To see this, let

$$\mathbb{Q}[w] = \Big\{\sum_{i=0}^{n-1} a_i w^i \ \Big|\ a_i \in \mathbb{Q}\Big\}.$$

It is known that $K = \mathbb{Q}[w]$, thus every $\alpha \in K$ corresponds to a vector $\bar\alpha \in \mathbb{Q}^n$ by

$$\alpha = \sum_{i=0}^{n-1} a_i w^i \ \xrightarrow{\ t\ }\ \bar\alpha = \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} \in \mathbb{Q}^n.$$

If $I \subset O_K$ is an ideal of $O_K$ and $I = \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots + \mathbb{Z}\alpha_n$, let $B = [\bar\alpha_1, \bar\alpha_2, \ldots, \bar\alpha_n] \in \mathbb{Q}^{n\times n}$, which is a full-rank matrix. We have $t(I) = L(B)$ as a full-rank lattice. It remains to show that $t(I)$ is a $\phi$-cyclic lattice; we only need to prove that $\alpha \in I \Rightarrow H\bar\alpha \in t(I)$. Suppose that $\alpha \in I$, then $w\alpha \in I$. It is easy to verify that $t(w) = e_2$ (see (2.7)) and

$$t(w\alpha) = t(w) * t(\alpha) = H\bar\alpha \in t(I).$$

This means that $t(I)$ is a $\phi$-cyclic lattice of $\mathbb{Q}^n$, which is a full-rank lattice.


4 Smoothing Parameter 


As an application of the algebraic structure of $\phi$-cyclic lattices, we show an explicit upper bound of the smoothing parameter for $\phi$-cyclic lattices. First, we introduce some basic notation.

A Gauss function $\rho_{s,c}(x)$ in $\mathbb{R}^n$ is given by

$$\rho_{s,c}(x) = e^{-\frac{\pi}{s^2}|x-c|^2}, \tag{34}$$

where $x \in \mathbb{R}^n$, $c \in \mathbb{R}^n$, and $s > 0$ is a positive real number. $\rho_{s,c}(x)$ is called the Gauss function around the point $c$ with parameter $s$. It is easy to see that

$$\int_{\mathbb{R}^n} \rho_{s,c}(x)\,dx = s^n.$$

Thus, we may define a probability density function $D_{s,c}(x)$ by

$$D_{s,c}(x) = \rho_{s,c}(x) \Big/ \int_{\mathbb{R}^n} \rho_{s,c}(x)\,dx = \rho_{s,c}(x)/s^n. \tag{35}$$

Suppose $L \subset \mathbb{R}^n$ is a lattice, let

$$D_{s,c}(L) = \sum_{x\in L} D_{s,c}(x), \qquad \rho_{s,c}(L) = \sum_{x\in L} \rho_{s,c}(x). \tag{36}$$

The discrete Gauss distribution over $L$ is a probability distribution $D_{L,s,c}$ over $L$ given by

$$D_{L,s,c}(x) = \frac{D_{s,c}(x)}{D_{s,c}(L)} = \frac{\rho_{s,c}(x)}{\rho_{s,c}(L)}. \tag{37}$$

If $c = 0$ is the zero vector of $\mathbb{R}^n$, we write $\rho_{s,0}(x) = \rho_s(x)$, $\rho_{s,0}(L) = \rho_s(L)$, $D_{s,0}(x) = D_s(x)$, and $D_{s,0}(L) = D_s(L)$. Suppose that $L$ is a full-rank lattice and $L^*$ is its dual lattice; we define the smoothing parameter $\eta_\varepsilon(L)$ of $L$ to be the smallest $s$ such that $\rho_{1/s}(L^*) \le 1 + \varepsilon$, more precisely,

$$\eta_\varepsilon(L) = \min\{s : s > 0 \text{ and } \rho_{1/s}(L^*) \le 1 + \varepsilon\}, \tag{38}$$

where $\varepsilon > 0$ is a positive number. Notice that $\rho_{1/s}(L^*)$ is a continuous and strictly decreasing function of $s$, thus the smoothing parameter $\eta_\varepsilon(L)$ is a continuous and strictly decreasing function of $\varepsilon$.
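Definition (38) can be explored numerically in the simplest case. The following is only a sketch under stated assumptions: it takes the one-dimensional lattice $L = \mathbb{Z}$ (whose dual is again $\mathbb{Z}$), truncates the Gaussian sum to finitely many terms, and finds the smallest $s$ with $\rho_{1/s}(L^*) \le 1 + \varepsilon$ by bisection. The function names and the truncation level are illustrative choices, not part of the chapter.

```python
import math

def rho_dual_Z(s, terms=60):
    """rho_{1/s}(Z) = sum over k in Z of exp(-pi * s^2 * k^2), truncated (assumption)."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * (s * k) ** 2)
                           for k in range(1, terms + 1))

def smoothing_parameter_Z(eps, lo=0.05, hi=10.0, iters=100):
    """Bisection for the smallest s with rho_{1/s}(Z) <= 1 + eps.

    Relies on rho_{1/s} being strictly decreasing in s, as noted in the text.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if rho_dual_Z(mid) <= 1.0 + eps:
            hi = mid   # mid already satisfies the bound, try smaller s
        else:
            lo = mid   # bound violated, s must be larger
    return hi

eta_half = smoothing_parameter_Z(0.5)    # eps = 2^{-n} with n = 1
eta_small = smoothing_parameter_Z(0.01)
print(eta_half, eta_small)
```

In this toy case the computed value for $\varepsilon = 1/2$ stays below $1 = \sqrt{n}/\lambda_1(L^*)$, and a smaller $\varepsilon$ gives a larger smoothing parameter, in line with the monotonicity remark above.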

Let $L = L(\beta_1, \beta_2, \ldots, \beta_n) \subset \mathbb{R}^n$ be a full-rank lattice with a basis $\beta_1, \beta_2, \ldots, \beta_n$; the fundamental region $P(L)$ is given by

$$P(L) = \Big\{\sum_{i=1}^{n} a_i\beta_i \ \Big|\ 0 \le a_i < 1,\ 1 \le i \le n\Big\}. \tag{39}$$

Suppose that $X$ and $Y$ are two discrete random variables on $\mathbb{R}^n$; the statistical distance between $X$ and $Y$ over $L$ is defined by

$$\Delta(X, Y) = \frac{1}{2}\sum_{a\in L} \big|P\{X = a\} - P\{Y = a\}\big|. \tag{40}$$

If $X$ and $Y$ are continuous random variables with probability density functions $T_1$ and $T_2$, respectively, then $\Delta(X, Y)$ is defined by

$$\Delta(X, Y) = \frac{1}{2}\int_{\mathbb{R}^n} |T_1(x) - T_2(x)|\,dx. \tag{41}$$

The smoothing parameter was introduced by Micciancio and Regev (2007), and it plays an important role in the statistical information of lattices. An important property of the smoothing parameter is that for any lattice $L = L(B)$ and any $s > 0$, the statistical distance between $D_{s,c} \bmod L$ and the uniform distribution over the fundamental region $P(L)$ is at most $\frac{1}{2}\rho_{1/s}(L(B)^* \setminus \{0\})$. More precisely, for any $\varepsilon > 0$ and any $s \ge \eta_\varepsilon(L(B))$, the statistical distance is at most $\frac{1}{2}\varepsilon$, namely

$$\Delta(D_{s,c} \bmod L,\ U(P(L))) \le \frac{\varepsilon}{2}. \tag{42}$$
Lemma 11 Let $L \subset \mathbb{R}^n$ be a full-rank lattice, we have

$$\eta_{2^{-n}}(L) \le \sqrt{n}/\lambda_1(L^*), \tag{43}$$

where $L^*$ is the dual lattice of $L$, and $\lambda_1(L^*)$ is the minimum distance of $L^*$.

Proof See Lemma 3.2 of Micciancio and Regev (2007), or Banaszczyk (1993).


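Lemma 12 Suppose that $L_1$ and $L_2$ are two full-rank lattices in $\mathbb{R}^n$, and $L_1 \subset L_2$, then for any $\varepsilon > 0$, we have

$$\eta_\varepsilon(L_2) \le \eta_\varepsilon(L_1). \tag{44}$$

Proof Let $\eta_\varepsilon(L_1) = s$; we are to show that $\eta_\varepsilon(L_2) \le s$. By definition,

$$\rho_{1/s}(L_1^*) = \sum_{x\in L_1^*} e^{-\pi s^2|x|^2} = 1 + \varepsilon.$$

It is easy to check that $L_2^* \subset L_1^*$; it follows that

$$1 + \varepsilon = \sum_{x\in L_1^*} e^{-\pi s^2|x|^2} \ge \sum_{x\in L_2^*} e^{-\pi s^2|x|^2},$$

which implies

$$\rho_{1/s}(L_2^*) \le 1 + \varepsilon,$$

and $\eta_\varepsilon(L_2) \le s = \eta_\varepsilon(L_1)$; thus we have Lemma 12.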


According to (2.4), the ideal matrix $H^*(f)$ with input vector $f \in \mathbb{R}^n$ is just the ordinary circulant matrix when $\phi(x) = x^n - 1$. The next lemma shows that the transpose of a circulant matrix is still a circulant matrix. For any

$$g = \begin{pmatrix} g_0 \\ g_1 \\ \vdots \\ g_{n-1} \end{pmatrix} \in \mathbb{R}^n, \quad\text{we denote}\quad \bar g = \begin{pmatrix} g_{n-1} \\ g_{n-2} \\ \vdots \\ g_0 \end{pmatrix},$$

which is called the conjugation of $g$.

Lemma 13 Let $\phi(x) = x^n - 1$, then for any $g = (g_0, g_1, \ldots, g_{n-1})' \in \mathbb{R}^n$, we have

$$(H^*(g))' = H^*(H\bar g). \tag{45}$$

Proof Since $\phi(x) = x^n - 1$, $H = H_\phi$ (see (2.3)) is an orthogonal matrix, and we have $H^{-1} = H^{n-1} = H'$. We write $H_1 = H' = H^{-1}$. The following identity is easy to verify:

$$H^*(g) = \begin{pmatrix} \bar g' H_1 \\ \bar g' H_1^2 \\ \vdots \\ \bar g' H_1^n \end{pmatrix}.$$

It follows that

$$(H^*(g))' = [H\bar g, H(H\bar g), \ldots, H^{n-1}(H\bar g)] = H^*(H\bar g),$$

and we have the lemma.
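Identity (45) is easy to test on a concrete vector. The sketch below assumes the convention visible in Example 2, namely that for $\phi(x) = x^n - 1$ the rotation matrix $H$ shifts the coordinates of a vector cyclically downward, and that $H^*(g) = [g, Hg, \ldots, H^{n-1}g]$; the helper names are illustrative only.

```python
def rotate(v):
    """H v for phi(x) = x^n - 1: cyclic downward shift (last entry moves to the top)."""
    return [v[-1]] + v[:-1]

def ideal_matrix(g):
    """H*(g) = [g, Hg, ..., H^{n-1} g], returned as a list of rows."""
    n = len(g)
    cols, col = [], list(g)
    for _ in range(n):
        cols.append(col)
        col = rotate(col)
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def conjugation(g):
    """The vector written g-bar in the text: coordinates in reverse order."""
    return list(reversed(g))

g = [3, 1, 4, 1, 5]
lhs = transpose(ideal_matrix(g))              # (H*(g))'
rhs = ideal_matrix(rotate(conjugation(g)))    # H*(H g-bar)
print(lhs == rhs)
```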


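Lemma 14 Suppose that $g \in \mathbb{R}^n$ and the circulant matrix $H^*(g)$ is invertible. Let $A = (H^*(g))' H^*(g)$, then all characteristic values of $A$ are given by

$$\{|g(\theta_1)|^2, |g(\theta_2)|^2, \ldots, |g(\theta_n)|^2\},$$

where $\theta_i^n = 1$ ($1 \le i \le n$) are the $n$-th roots of unity.

Proof By Lemma 13 and (ii) of Theorem 2, we have

$$A = H^*(H\bar g)H^*(g) = H^*(H^*(H\bar g)g) = H^*(g''),$$

where $g'' = H^*(H\bar g)g$. Let $g''(x) = t^{-1}(g'')$ be the corresponding polynomial of $g''$. By (iii) of Theorem 2, all characteristic values of $A$ are given by

$$\{g''(\theta_1), g''(\theta_2), \ldots, g''(\theta_n)\}, \quad \theta_i^n = 1,\ 1 \le i \le n. \tag{46}$$

Let $g = (g_0, g_1, \ldots, g_{n-1})' \in \mathbb{R}^n$. It is easy to see that

$$g''(x) = \sum_{i=0}^{n-1} g_i^2 + \Big(\sum_{i=0}^{n-1} g_i g_{i-1}\Big)x + \cdots + \Big(\sum_{i=0}^{n-1} g_i g_{i-(n-1)}\Big)x^{n-1},$$

where the indices are read modulo $n$, that is, $g_{-i} = g_{n-i}$ for all $1 \le i \le n - 1$. For every $\theta$ with $\theta^n = 1$ this gives $g''(\theta) = g(\theta)\overline{g(\theta)} = |g(\theta)|^2$, and the lemma follows at once.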
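The statement of Lemma 14 can be checked numerically without an eigenvalue solver. The sketch below is an illustration under the same assumed conventions as before ($H$ acts as a cyclic downward shift): for each $n$-th root of unity $\theta$ it verifies that the vector $(1, \theta, \ldots, \theta^{n-1})'$ is an eigenvector of $A$ with eigenvalue $|g(\theta)|^2$. The choice of eigenvector is a standard fact about circulant matrices and is not taken from the chapter.

```python
import cmath

def rotate(v):
    """Cyclic downward shift, i.e. H v for phi(x) = x^n - 1 (assumed convention)."""
    return [v[-1]] + v[:-1]

def ideal_matrix(g):
    """H*(g) = [g, Hg, ..., H^{n-1} g] as a list of rows."""
    n = len(g)
    cols, col = [], list(g)
    for _ in range(n):
        cols.append(col)
        col = rotate(col)
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def gram(M):
    """A = M' M."""
    n = len(M)
    return [[sum(M[k][i] * M[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eigen_residuals(g):
    """For each n-th root of unity theta, return max |A v - |g(theta)|^2 v|."""
    n = len(g)
    A = gram(ideal_matrix(g))
    residuals = []
    for k in range(n):
        theta = cmath.exp(2j * cmath.pi * k / n)
        lam = abs(sum(g[i] * theta ** i for i in range(n))) ** 2
        v = [theta ** j for j in range(n)]
        Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        residuals.append(max(abs(Av[i] - lam * v[i]) for i in range(n)))
    return residuals

res = eigen_residuals([3.0, 1.0, 4.0, 1.0, 5.0])
print(max(res))  # should be at the level of floating-point rounding
```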


By Definition 4, if $g \in \mathbb{R}^n$ is a prime spot, then there is a unique polynomial $u(x) \in \overline{R}$ such that $u(x)g(x) \equiv 1 \pmod{\phi(x)}$. We define a new vector $T_g$ and its corresponding polynomial $T_g(x)$ by

$$T_g = H\bar u, \quad\text{and}\quad T_g(x) = t^{-1}(H\bar u). \tag{47}$$

If $g \in \mathbb{Z}^n$ is an integer vector, then $u(x)$ has rational coefficients, so $T_g \in \mathbb{Q}^n$ and $T_g(x) \in \mathbb{Q}[x]$; they have integer coefficients exactly when $u(x) \in \mathbb{Z}[x]$. Our main result on the smoothing parameter is the following theorem.
Theorem 5 Let $\phi(x) = x^n - 1$ and let $L \subset \mathbb{R}^n$ be a full-rank $\phi$-cyclic lattice, then for any prime spot $g \in L$, we have

$$\eta_{2^{-n}}(L) \le \sqrt{n}\,\big(\min\{|T_g(\theta_1)|, |T_g(\theta_2)|, \ldots, |T_g(\theta_n)|\}\big)^{-1}, \tag{48}$$

where $\theta_i^n = 1$, $1 \le i \le n$, and $T_g(x)$ is given by (47).

Proof Let $g \in L$ be a prime spot; by Lemma 12, we have

$$L(H^*(g)) \subset L \ \Rightarrow\ \eta_\varepsilon(L) \le \eta_\varepsilon(L(H^*(g))), \quad \forall \varepsilon > 0. \tag{49}$$

To estimate the smoothing parameter of $L(H^*(g))$, note that the dual lattice of $L(H^*(g))$ is given by

$$L(H^*(g))^* = L((H^*(u))') = L(H^*(H\bar u)) = L(H^*(T_g)),$$

where $u(x) \in \overline{R}$ and $u(x)g(x) \equiv 1 \pmod{x^n - 1}$, and $T_g$ is given by (47). Let $A = (H^*(T_g))' H^*(T_g)$; by Lemma 14, all characteristic values of $A$ are

$$\{|T_g(\theta_1)|^2, |T_g(\theta_2)|^2, \ldots, |T_g(\theta_n)|^2\}.$$

By Lemma 2, the minimum distance $\lambda_1(L(H^*(g))^*)$ is bounded by

$$\lambda_1(L(H^*(g))^*) \ge \min\{|T_g(\theta_1)|, |T_g(\theta_2)|, \ldots, |T_g(\theta_n)|\}. \tag{50}$$

Now, Theorem 5 follows from Lemma 11 immediately.


Let $L = L(B)$ be a full-rank lattice and $B = [\beta_1, \beta_2, \ldots, \beta_n]$. We denote by $B^* = [\beta_1^*, \beta_2^*, \ldots, \beta_n^*]$ the Gram-Schmidt orthogonal vectors $\{\beta_i^*\}$ of the ordered basis $B = \{\beta_i\}$. It is a well-known conclusion that

$$\lambda_1(L) \ge |B^*| = \min_{1\le i\le n} |\beta_i^*|,$$

which yields by Lemma 11 the following upper bound

$$\eta_{2^{-n}}(L) \le \sqrt{n}\,|B_0^*|^{-1}, \tag{51}$$

where $B_0^*$ is the orthogonal basis of the dual lattice $L^*$ of $L$.

For a $\phi$-cyclic lattice $L$, numerical testing suggests that the upper bound (48) is always better than (51); we give two examples here.


Example 2 Let $n = 3$ and $\phi(x) = x^3 - 1$; the rotation matrix $H$ is

$$H = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$

We select a $\phi$-cyclic lattice $L = L(B)$, where

$$B = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$

Since $L = \mathbb{Z}^3$, $L$ is a $\phi$-cyclic lattice. It is easy to check

$$|B_0^*| = \min_{1\le i\le 3} |\beta_i^*| = \frac{\sqrt{3}}{3}.$$

On the other hand, we randomly find a prime spot $g = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \in L$ and $g(x) = x^2$. Since $x\,g(x) \equiv 1 \pmod{x^3 - 1}$, we have $T_g(x) = x^2$; it follows that $|T_g(\theta_1)| = |T_g(\theta_2)| = |T_g(\theta_3)| = 1$, and

$$\Big(\min_{1\le i\le 3} |T_g(\theta_i)|\Big)^{-1} = 1 < |B_0^*|^{-1} = \sqrt{3}.$$

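Example 3 Let $n = 4$ and $\phi(x) = x^4 - 1$; the rotation matrix $H$ is

$$H = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

We select a $\phi$-cyclic lattice $L = L(B)$, where

$$B = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Since $L = \mathbb{Z}^4$, $L$ is a $\phi$-cyclic lattice. It is easy to check

$$|B_0^*| = \min_{1\le i\le 4} |\beta_i^*| = \frac{1}{2}.$$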
-2

On the other hand, we randomly find a prime spot g = 7 € Land g(x) = x — 
0 

2. 2 Gx? — Ix? — 2x — Dga) = = 1 (mod x^ — 1), we have T,(x) = — x- 


2 
7* 
lx? 4 1x — 5, it follows that |7,(8,)| = 1, IT, (62) = |T,(63)| = ITs) = 3, nd 


min |7,(65,) | = = «|B;p| 1 22 
Eu 2 (9) < |Bo| 


alr 
On the LWE Cryptosystem with More General Disturbance


Zheng Zhiyong and Tian Kun 


Abstract The main purpose of this chapter is to extend the analysis of the probability of decryption error of the learning with errors (LWE)-based cryptosystem to more general disturbances. In the first section, we introduce the LWE cryptosystem with its applications and some previous research results. Then we give a more precise estimate of the probability of decryption error, both for independent identically distributed Gaussian disturbances and for general independent identically distributed disturbances. This upper bound can be made close to 0 if we choose suitable parameters, which means that the probability of decryption error for the cryptosystem can be made sufficiently small. So we verify our core result that the LWE-based cryptosystem could have high security.


Keywords Learning with errors problem · Decryption error · Probability · General disturbance


1 Introduction 


In this section, we describe a cryptosystem based on the learning with errors problem (LWE) (Micciancio & Regev, 2009; Regev, 2005). First, we introduce the LWE problem. Let $p$ be a prime number, $m, n$ be positive integers and consider a list of equations with error as follows:

$$\begin{aligned} \langle s, a_1\rangle &\approx_\chi v_1 \pmod p, \\ \langle s, a_2\rangle &\approx_\chi v_2 \pmod p, \\ &\ \ \vdots \\ \langle s, a_m\rangle &\approx_\chi v_m \pmod p. \end{aligned}$$

Here $s \in \mathbb{Z}_p^n$, $a_1, a_2, \ldots, a_m$ are chosen independently and uniformly from $\mathbb{Z}_p^n$, and $v_1, v_2, \ldots, v_m \in \mathbb{Z}_p$. $\langle s, a_i\rangle$ is the inner product of the two vectors $s$ and $a_i$. The errors in these equations are generated from a probability distribution $\chi : \mathbb{Z}_p \to \mathbb{R}^+$ on $\mathbb{Z}_p$, i.e. for each equation, we have $v_i = \langle s, a_i\rangle + e_i$ and $e_i \in \mathbb{Z}_p$ is chosen independently according to the probability distribution $\chi$. The problem of finding $s \in \mathbb{Z}_p^n$ from such equations is called $\mathrm{LWE}_{p,\chi}$. There is an equivalent description of the LWE problem. The input is a pair $(A, v)$ where $A \in \mathbb{Z}_p^{m\times n}$ is chosen uniformly, and there are two cases for $v$. In one case $v$ is chosen uniformly from $\mathbb{Z}_p^m$; in the other case $v = As + e$ for a uniformly chosen vector $s \in \mathbb{Z}_p^n$ and a vector $e \in \mathbb{Z}_p^m$ chosen according to $\chi^m$. The goal is to distinguish between these two cases with non-negligible probability. It is also equivalent to a decoding problem in $q$-ary lattices (Regev, 2005).
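To make the two descriptions concrete, the following sketch generates a small LWE instance $(A, v = As + e \bmod p)$. It is only an illustration: the error distribution $\chi$ is assumed here to be a rounded Gaussian with standard deviation `sigma`, which is one common choice but is not fixed by the text, and the function name and parameters are hypothetical.

```python
import random

def lwe_samples(n, m, p, sigma, rng):
    """Return (A, v, s, e) with v = A s + e (mod p).

    The error distribution chi is assumed to be a rounded Gaussian (illustrative choice).
    """
    s = [rng.randrange(p) for _ in range(n)]                    # secret in Z_p^n
    A = [[rng.randrange(p) for _ in range(n)] for _ in range(m)]  # uniform rows a_i
    e = [round(rng.gauss(0.0, sigma)) % p for _ in range(m)]    # errors e_i in Z_p
    v = [(sum(A[i][j] * s[j] for j in range(n)) + e[i]) % p for i in range(m)]
    return A, v, s, e

rng = random.Random(0)
A, v, s, e = lwe_samples(n=4, m=8, p=17, sigma=1.0, rng=rng)

# Whoever knows s can strip off <s, a_i> and recover the error terms.
residual = [(v[i] - sum(A[i][j] * s[j] for j in range(4))) % 17 for i in range(8)]
print(residual == e)
```

Without $s$, the search version asks to recover it from $(A, v)$ alone, and the decision version asks to tell such a $v$ apart from a uniformly random vector in $\mathbb{Z}_p^m$.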

The short integer solution (SIS) problem was first introduced in the seminal work 
of Ajtai (1996) and has served as the foundation for one-way and collision-resistant 
hash functions, identification schemes, digital signatures, and other “minicrypt” 
primitives. A very important work of Regev from 2005 introduced the LWE problem, 
which is the “encryption-enabling” analogue of the SIS problem (Regev, 2009). In 
fact, the two problems are very similar and can meaningfully be seen as duals of 
each other. 

The LWE problem is a very robust problem and can be viewed as an extension of a well-known problem in learning theory. It remains hard even if the attacker learns extra information about the secret and errors. The best-known algorithms for LWE run in time exponential in $n$ (Ajtai et al., 2001; Blum et al., 2003; Kumar & Sivakumar, 2001). Regev gave the worst-case hardness theorem for LWE (Regev, 2009). This theorem is proved by giving a quantum polynomial-time reduction that uses an oracle for LWE to solve $\mathrm{GapSVP}_\gamma$ and $\mathrm{SIVP}_\gamma$ in the worst case, thereby transforming any algorithm that solves LWE into a quantum algorithm for lattice problems. The quantum nature of the reduction is meaningful since there are no known quantum algorithms for $\mathrm{GapSVP}_\gamma$ and $\mathrm{SIVP}_\gamma$ that significantly outperform classical ones, beyond generic quantum speedups. It would be very useful to have a completely classical reduction to give further confidence in the hardness of LWE; such a reduction was given in 2009 by Peikert (2009). Regev also gave a public-key
cryptosystem whose semantic security can provably be based on the LWE prob- 
lem, and hence on the conjectured quantum hardness of GapSVP,, and SIVP; for 
y = O (n?) (Regev, 2009). LWE problem has a close relationship with decoding 
problems in coding theory (Ajtai, 2005; Ajtai & Dwork, 1997; Alekhnovich, 2003; 
Asokan et al., 2007; Ding, 2004; Kawachi et al., 2007; Peikert, 2007; Peikert et al., 
2008; Regev, 2004; Signing et al., 2022). Regev's cryptosystem is secure against 
passive eavesdroppers since the LWE problem is hard. 
Another application of LWE is fully homomorphic encryption (FHE) (Rivest et al., 1978). The earliest FHE constructions were based on average-case assumptions about ideal lattices (Gentry, 2009; Dijk et al., 2010). Later, Brakerski and Vaikuntanathan gave the second generation of FHE constructions, which were based on the LWE problem (Brakerski & Vaikuntanathan, 2011a, b). In 2013, Gentry, Sahai, and Waters
proposed an LWE-based FHE scheme that has some unique and advantageous prop- 
erties, such as homomorphic multiplication does not require any key-switching step, 
and the scheme can be made identity-based. This yields unbounded FHE based on 
LWE with just an inverse-polynomial n 9 error rate (Gentry et al., 1999). 

Now we introduce the following efficient lattice-based cryptosystem, which has strong theoretical security (Micciancio & Regev, 2009).


Private key: $S \in \mathbb{Z}_q^{n\times l}$ is chosen uniformly at random.


Public key: $A \in \mathbb{Z}_q^{m\times n}$ is chosen uniformly at random and $E \in \mathbb{Z}_q^{m\times l}$ is chosen with each entry drawn from the distribution $\Psi_\alpha$. The public key is $(A, P = AS + E)$.

Encryption: Given $v \in \mathbb{Z}_t^l$ from the message space and a public key $(A, P)$, choose a vector $a \in \{-r, -r+1, \ldots, r\}^m$ uniformly at random, and compute the ciphertext $(u = A^T a,\ c = P^T a + f(v))$.

Decryption: Given a ciphertext $(u, c)$ and a private key $S$, output $f^{-1}(c - S^T u)$.


Here $m, n, l, t, q, r$ are positive integers and $\alpha > 0$. $\Psi_\alpha$ is defined to be the distribution on $\mathbb{Z}_q$ obtained by sampling a normal variable with mean 0 and standard deviation $\alpha q/\sqrt{2\pi}$, rounding the result to the nearest integer and reducing it modulo $q$. $f$ is defined as the function from $\mathbb{Z}_t^l$ to $\mathbb{Z}_q^l$ obtained by multiplying each coordinate by $q/t$ and rounding to the nearest integer. $f^{-1}$ is defined to be the "inverse" mapping of $f$, obtained by multiplying each coordinate by $t/q$ and rounding to the nearest integer. The precise definitions of $f$ and $f^{-1}$ are given in the next section. The probability of decryption error in one letter for this cryptosystem is approximately estimated in (Micciancio & Regev, 2009) as

$$\text{error probability per letter} \approx 2\left(1 - \Phi\left(\frac{1}{2t\alpha}\sqrt{\frac{6\pi}{mr(r+1)}}\right)\right), \tag{1}$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution, i.e. $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\,dy$. We give here a more precise upper bound estimate

$$\text{error probability} \le 2l\left(1 - \Phi\left(\frac{q-t}{2\alpha t q}\sqrt{\frac{6\pi}{mr(r+1)}}\right)\right). \tag{2}$$


This upper bound can be made close to 0 if we choose $\alpha$ small enough, which means that the probability of decryption error for the cryptosystem can be made sufficiently small. However, the above estimate is based on Gaussian disturbance. In our work, we also give the probability of decryption error for the LWE-based cryptosystem with more general disturbance. By the central limit theorem (Riauba, 1975), a general disturbance can be approximated by a Gaussian disturbance, and we get the following probability estimate, which is more general than that in (Micciancio & Regev, 2009):

$$\text{error probability} \le 2l\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + l\delta. \tag{3}$$

Here $\beta$ is the standard deviation of the disturbance distribution, and $\delta$ is a positive real number.


1.1 Innovation and Contribution 


Our work gives an estimate of the probability of decryption error based on Gaussian disturbances and proves that this probability can be made sufficiently small. The most salient innovation and contribution is that for any general disturbances, the probability of decryption error can also be made small enough. This indicates high security and reliability of the LWE-based cryptosystem. In other words, this cryptosystem is secure enough against passive eavesdroppers and could be applied in many kinds of encryption processes.


2 Methodology 


2.1 Preliminary Property 


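Definition 1 For all $x \in \mathbb{R}$, let $[x]$ be the closest integer to $x$; in particular, $[x]$ is defined to be $x - \frac{1}{2}$ if the fractional part of $x$ is $\frac{1}{2}$. It is trivial that $-\frac{1}{2} < x - [x] \le \frac{1}{2}$ for all $x \in \mathbb{R}$.


Lemma 1 Let $t$ and $q$ be positive integers with $t \le q$. For all $a \in \mathbb{Z}_t$, let $f(a) = \left[\frac{q}{t}a\right] \in \mathbb{Z}_q$. For all $b \in \mathbb{Z}_q$, let $f^{-1}(b) = \left[\frac{t}{q}b\right] \in \mathbb{Z}_t$. Then $f^{-1}(f(a)) = a$ holds for all $a \in \mathbb{Z}_t$.


Remark 1 If $a_1 \equiv a_2 \pmod t$, we have $f(a_1) \equiv f(a_2) \pmod q$, so $f$ is well defined.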


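Proof of Lemma 1 (1) If $t = q$, then we have $f(a) = [a] = a$ and $f^{-1}(f(a)) = f^{-1}(a) = [a] = a$ for all $a \in \mathbb{Z}_t$.

(2) If $t < q$, then $\frac{q}{t} > 1$, and by Definition 1 we know

$$-\frac{1}{2} < \frac{q}{t}a - \left[\frac{q}{t}a\right] \le \frac{1}{2}.$$

It follows that

$$\frac{q}{t}a - \frac{1}{2} \le \left[\frac{q}{t}a\right] < \frac{q}{t}a + \frac{1}{2}.$$

So we can get

$$\frac{q}{t}a - \frac{q}{2t} < \left[\frac{q}{t}a\right] < \frac{q}{t}a + \frac{q}{2t}.$$

This is equivalent to

$$a - \frac{1}{2} < \frac{t}{q}\left[\frac{q}{t}a\right] < a + \frac{1}{2},$$

and

$$-\frac{1}{2} < \frac{t}{q}\left[\frac{q}{t}a\right] - a < \frac{1}{2}.$$

Thus,

$$\left[\frac{t}{q}\left[\frac{q}{t}a\right] - a\right] = 0, \quad\text{and}\quad \left[\frac{t}{q}\left[\frac{q}{t}a\right]\right] = a.$$

This means that

$$f^{-1}(f(a)) = a, \quad \forall a \in \mathbb{Z}_t.$$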
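Lemma 1 can be checked exhaustively for small parameters. The sketch below implements the rounding of Definition 1 with integer arithmetic only (a tie $k + \frac{1}{2}$ is rounded down to $k$) and represents $\mathbb{Z}_t$ and $\mathbb{Z}_q$ by the residues $0, \ldots, t-1$ and $0, \ldots, q-1$; the function names are illustrative.

```python
def round_half_down(p, q):
    """Nearest integer to p/q for q > 0; a tie k + 1/2 is sent to k (Definition 1)."""
    # ceil(p/q - 1/2) = ceil((2p - q) / (2q)) = -floor((q - 2p) / (2q))
    return -((q - 2 * p) // (2 * q))

def f(a, t, q):
    """f(a) = [q a / t], reduced modulo q."""
    return round_half_down(q * a, t) % q

def f_inv(b, t, q):
    """f^{-1}(b) = [t b / q], reduced modulo t."""
    return round_half_down(t * b, q) % t

def lemma1_holds(t, q):
    """Check f^{-1}(f(a)) = a for every a in Z_t."""
    return all(f_inv(f(a, t, q), t, q) == a for a in range(t))

print(all(lemma1_holds(t, q) for q in range(1, 40) for t in range(1, q + 1)))
```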


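Lemma 2 Let $t$ and $q$ be positive integers with $t > q$. If $a$ is uniformly chosen in $\mathbb{Z}_t$, then

$$P\{f^{-1}(f(a)) = a\} = \frac{q}{t}.$$

Proof of Lemma 2 Since $t > q$, from Lemma 1 (with the roles of $t$ and $q$ exchanged) we have

$$f^{-1}\left(f\left(\left[\frac{t}{q}b\right]\right)\right) = f^{-1}(b) = \left[\frac{t}{q}b\right], \quad \forall b \in \mathbb{Z}_q.$$

Here $0, \left[\frac{t}{q}\right], \left[\frac{2t}{q}\right], \ldots, \left[\frac{(q-1)t}{q}\right]$ are different from each other in $\mathbb{Z}_t$. Next we prove that the number of $a$ in $\mathbb{Z}_t$ satisfying $f^{-1}(f(a)) = a$ is no more than $q$. Let $A$ be the set containing all the elements of $\mathbb{Z}_t$ satisfying $f^{-1}(f(a)) = a$. For all $a_1, a_2 \in A$ with $a_1 \ne a_2$ in $\mathbb{Z}_t$, we have $f(a_1) \not\equiv f(a_2) \pmod q$, i.e. $f(a_1) \ne f(a_2)$ in $\mathbb{Z}_q$. This means the number of elements of $A$ is no more than $q$.

Above all, it shows that $0, \left[\frac{t}{q}\right], \left[\frac{2t}{q}\right], \ldots, \left[\frac{(q-1)t}{q}\right]$ are exactly all the numbers in $\mathbb{Z}_t$ such that $f^{-1}(f(a)) = a$. Since $a$ is uniformly chosen in $\mathbb{Z}_t$, we get

$$P\{f^{-1}(f(a)) = a\} = \frac{q}{t}.$$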


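Corollary 1 Let $t$, $q$, and $l$ be positive integers. For all $a = (a_1, a_2, \ldots, a_l) \in \mathbb{Z}_t^l$, let $f(a) = \left(\left[\frac{q}{t}a_1\right], \left[\frac{q}{t}a_2\right], \ldots, \left[\frac{q}{t}a_l\right]\right) \in \mathbb{Z}_q^l$. For all $b = (b_1, b_2, \ldots, b_l) \in \mathbb{Z}_q^l$, let $f^{-1}(b) = \left(\left[\frac{t}{q}b_1\right], \left[\frac{t}{q}b_2\right], \ldots, \left[\frac{t}{q}b_l\right]\right) \in \mathbb{Z}_t^l$. If $a$ is uniformly chosen in $\mathbb{Z}_t^l$ and $a_1, a_2, \ldots, a_l$ are independent, then

$$P\{f^{-1}(f(a)) \ne a\} = \max\left\{0,\ 1 - \left(\frac{q}{t}\right)^l\right\}.$$

Proof of Corollary 1 If $t \le q$, from Lemma 1, we have

$$f^{-1}(f(a_i)) = a_i, \quad \forall a_i \in \mathbb{Z}_t,\ \forall 1 \le i \le l.$$

So

$$f^{-1}(f(a)) = a, \quad \forall a \in \mathbb{Z}_t^l,$$

$$P\{f^{-1}(f(a)) \ne a\} = 0 = \max\left\{0,\ 1 - \left(\frac{q}{t}\right)^l\right\}.$$

If $t > q$, from Lemma 2, we have

$$P\{f^{-1}(f(a_i)) = a_i\} = \frac{q}{t}, \quad a_i \in \mathbb{Z}_t,\ \forall 1 \le i \le l.$$

Since $a_1, a_2, \ldots, a_l$ are independent, therefore,

$$P\{f^{-1}(f(a)) = a\} = \left(\frac{q}{t}\right)^l, \quad a \in \mathbb{Z}_t^l,$$

$$P\{f^{-1}(f(a)) \ne a\} = 1 - \left(\frac{q}{t}\right)^l = \max\left\{0,\ 1 - \left(\frac{q}{t}\right)^l\right\}.$$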
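Lemma 2 and Corollary 1 can also be confirmed by direct enumeration. The following sketch counts the elements $a \in \mathbb{Z}_t$ with $f^{-1}(f(a)) = a$ and compares the resulting exact failure probability with $\max\{0, 1 - (q/t)^l\}$. It uses the same integer implementation of the rounding in Definition 1 as an assumption; the function names are illustrative.

```python
from fractions import Fraction

def round_half_down(p, q):
    """Nearest integer to p/q for q > 0; a tie k + 1/2 is sent to k (Definition 1)."""
    return -((q - 2 * p) // (2 * q))

def f(a, t, q):
    return round_half_down(q * a, t) % q

def f_inv(b, t, q):
    return round_half_down(t * b, q) % t

def fixed_points(t, q):
    """All a in Z_t with f^{-1}(f(a)) = a."""
    return [a for a in range(t) if f_inv(f(a, t, q), t, q) == a]

def failure_probability(t, q, l):
    """Exact P{f^{-1}(f(a)) != a} for a uniform in Z_t^l with independent coordinates."""
    per_letter = Fraction(len(fixed_points(t, q)), t)
    return 1 - per_letter ** l

print(fixed_points(5, 2), failure_probability(5, 2, 3))  # [0, 2] 117/125
```

For $t = 5$, $q = 2$ the fixed points are $0 = [0 \cdot \frac{5}{2}]$ and $2 = [1 \cdot \frac{5}{2}]$, matching the description in the proof of Lemma 2.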
2.2 Probability of Decryption Error Based on Gaussian Disturbance


Now we can calculate the probability of decryption error for the LWE-based cryptosystem. As described in the first section, let $S$ be the private key and $(A, P)$ be the public key; we choose $v \in \mathbb{Z}_t^l$ from the message space, encrypt $v$, and then decrypt it. The ciphertext is $(u = A^T a,\ c = P^T a + f(v))$. The decryption result is

$$\begin{aligned} f^{-1}(c - S^T u) &= f^{-1}(P^T a + f(v) - S^T u) \\ &= f^{-1}((AS + E)^T a + f(v) - S^T A^T a) \\ &= f^{-1}(E^T a + f(v)). \end{aligned}$$

Here the decryption result $f^{-1}(E^T a + f(v)) \in \mathbb{Z}_t^l$. The decryption error occurs if $f^{-1}(E^T a + f(v)) \ne v$. Since all the parameters are taken to guarantee security and efficiency of the cryptosystem, here we set $q > t$ and obtain the following theorem.


Theorem 1 Let $t, q, l, m, r$ be positive integers with $q > t$. Let $v \in \mathbb{Z}_t^l$, let $f$ be defined as in the previous section, let $E_{m\times l}$ be a Gaussian disturbance matrix with each element chosen independently from the Gaussian distribution with mean 0 and standard deviation $\alpha q/\sqrt{2\pi}$, and let $a \in \{-r, -r+1, \ldots, r\}^m$ be uniformly chosen at random. Then we have the following inequality for the probability of decryption error:

$$P\{f^{-1}(E^T a + f(v)) \ne v\} \le 2l\left(1 - \Phi\left(\frac{q-t}{2\alpha t q}\sqrt{\frac{6\pi}{mr(r+1)}}\right)\right).$$

Here $\Phi$ is the cumulative distribution function of the standard normal distribution, i.e. $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\,dy$.

Proof of Theorem 1 In order to compute the probability of decryption error, we consider one letter first, i.e. the probability of $f^{-1}(E_i^T a + f(v_i)) \ne v_i$; here $v_i$ is the $i$th coordinate of $v$, $E_{m\times l} = (E_1, E_2, \ldots, E_l)$, and $f^{-1}(E_i^T a + f(v_i))$ is the $i$th coordinate of $f^{-1}(E^T a + f(v))$. From Lemma 1, we know that $f^{-1}(f(v_i)) = v_i$ for any $v_i \in \mathbb{Z}_t$ under this condition. We have

$$\left|\frac{t}{q}\left(E_i^T a + f(v_i)\right) - v_i\right| \le \left|\frac{t}{q}E_i^T a\right| + \left|\frac{t}{q}\left[\frac{q}{t}v_i\right] - v_i\right| \le \left|\frac{t}{q}E_i^T a\right| + \frac{t}{2q}.$$

So if $\left|\frac{t}{q}E_i^T a\right| < \frac{1}{2} - \frac{t}{2q}$, we get

$$f^{-1}(E_i^T a + f(v_i)) = v_i.$$

Equivalently, if $f^{-1}(E_i^T a + f(v_i)) \ne v_i$, i.e. the decryption error occurs in the $i$th letter, then $\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}$. So the probability of decryption error in one letter is no more than the probability of $\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}$:

$$P\{f^{-1}(E_i^T a + f(v_i)) \ne v_i\} \le P\left\{\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}\right\}.$$

In the next step we estimate the probability of $\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}$. Since each coordinate of $E_i$ is chosen independently from the Gaussian distribution with mean 0 and standard deviation $\alpha q/\sqrt{2\pi}$ and the sum of independent Gaussian variables is still a Gaussian variable, $E_i^T a$ is also a Gaussian distribution variable. Here $a = (a_1, a_2, \ldots, a_m)$ and each $a_i$ is chosen from $\{-r, -r+1, \ldots, r\}$ uniformly at random, so

$$E(a_i) = \frac{-r + (-r+1) + \cdots + r}{2r+1} = 0,$$

$$\operatorname{Var}(a_i) = \frac{(-r)^2 + (-r+1)^2 + \cdots + r^2}{2r+1} = \frac{r(r+1)}{3},$$

$$E(E_i^T a) = 0,$$

$$\operatorname{Var}(E_i^T a) = \left(\frac{\alpha q}{\sqrt{2\pi}}\right)^2 \frac{mr(r+1)}{3} = \frac{\alpha^2 q^2 mr(r+1)}{6\pi}.$$

Therefore, $E_i^T a$ is treated as a normal distribution with mean 0 and standard deviation $\alpha q\sqrt{mr(r+1)}/\sqrt{6\pi}$. We have

$$\begin{aligned} P\left\{\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}\right\} &= P\left\{\left|E_i^T a\right| \ge \frac{q-t}{2t}\right\} \\ &= P\left\{\left|E_i^T a\right| \Big/ \left(\alpha q\sqrt{\frac{mr(r+1)}{6\pi}}\right) \ge \frac{q-t}{2\alpha t q}\sqrt{\frac{6\pi}{mr(r+1)}}\right\} \\ &= 2\left(1 - \Phi\left(\frac{q-t}{2\alpha t q}\sqrt{\frac{6\pi}{mr(r+1)}}\right)\right). \end{aligned}$$

So we get the following inequality for the probability of decryption error of the LWE-based cryptosystem

$$\begin{aligned} P\{f^{-1}(E^T a + f(v)) \ne v\} &\le l\,P\{f^{-1}(E_i^T a + f(v_i)) \ne v_i\} \\ &\le l\,P\left\{\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}\right\} \\ &= 2l\left(1 - \Phi\left(\frac{q-t}{2\alpha t q}\sqrt{\frac{6\pi}{mr(r+1)}}\right)\right). \end{aligned}$$

This upper bound estimate is more precise than (1). The upper bound can be made as close to 0 as desired if we choose $\alpha$ small enough. It means that the probability of decryption error for the LWE-based cryptosystem can be made very small with an appropriate setting of parameters.
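The scheme and the bound of Theorem 1 can be put side by side in a toy implementation. This is a minimal sketch under stated assumptions: the parameters are far too small to be secure, the Gaussian entries of $E$ are sampled with Python's pseudo-random generator, $\mathbb{Z}_q$ is represented by residues $0, \ldots, q-1$, and all function names are illustrative rather than taken from the chapter.

```python
import math
import random

def round_half_down(p, q):
    """Nearest integer to p/q for q > 0; a tie k + 1/2 is sent to k (Definition 1)."""
    return -((q - 2 * p) // (2 * q))

def f(a, t, q):
    return round_half_down(q * a, t) % q

def f_inv(b, t, q):
    return round_half_down(t * b, q) % t

def keygen(n, m, l, q, alpha, rng):
    """Private key S (n x l) and public key (A, P = A S + E)."""
    sigma = alpha * q / math.sqrt(2 * math.pi)
    S = [[rng.randrange(q) for _ in range(l)] for _ in range(n)]
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    E = [[round(rng.gauss(0.0, sigma)) for _ in range(l)] for _ in range(m)]
    P = [[(sum(A[i][k] * S[k][j] for k in range(n)) + E[i][j]) % q
          for j in range(l)] for i in range(m)]
    return S, (A, P)

def encrypt(v, public, t, q, r, rng):
    """Ciphertext (u = A^T a, c = P^T a + f(v)) for a uniform in {-r, ..., r}^m."""
    A, P = public
    m, n, l = len(A), len(A[0]), len(v)
    a = [rng.randint(-r, r) for _ in range(m)]
    u = [sum(A[i][k] * a[i] for i in range(m)) % q for k in range(n)]
    c = [(sum(P[i][j] * a[i] for i in range(m)) + f(v[j], t, q)) % q
         for j in range(l)]
    return u, c

def decrypt(cipher, S, t, q):
    """Output f^{-1}(c - S^T u), coordinate by coordinate."""
    u, c = cipher
    n, l = len(S), len(S[0])
    return [f_inv((c[j] - sum(S[k][j] * u[k] for k in range(n))) % q, t, q)
            for j in range(l)]

def gaussian_error_bound(t, q, l, m, r, alpha):
    """Right-hand side of the inequality in Theorem 1."""
    x = (q - t) / (2 * alpha * t * q) * math.sqrt(6 * math.pi / (m * r * (r + 1)))
    Phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return 2 * l * (1.0 - Phi)

rng = random.Random(1)
n, m, l, q, t, r, alpha = 8, 16, 4, 257, 2, 1, 0.001
S, public = keygen(n, m, l, q, alpha, rng)
v = [rng.randrange(t) for _ in range(l)]
print(decrypt(encrypt(v, public, t, q, r, rng), S, t, q) == v)
print(gaussian_error_bound(t, q, l, m, r, alpha))
```

With such a small $\alpha$ the disturbance $E^T a$ stays far below the threshold $(q - t)/(2t)$, so decryption succeeds and the bound is numerically indistinguishable from 0; increasing $\alpha$ makes both the empirical failures and the bound grow.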


2.3 Probability of Decryption Error for General Disturbance 


In this section, we estimate the probability of decryption error for the LWE-based cryptosystem when the entries of the noise matrix $E = (E_{ij})_{m\times l}$ are chosen independently from a general common random variable.


Theorem 2 Let $t, q, l, r$ be positive integers with $q > t$, and let $m$ be an undetermined positive integer. Let $v \in \mathbb{Z}_t^l$, let $f$ be defined as in the second section, let $E_{m\times l}$ be a general disturbance matrix with each element chosen independently from a common random variable of mean 0 and standard deviation $\beta$, and let $a \in \{-r, -r+1, \ldots, r\}^m$ be uniformly chosen at random. For any $\delta > 0$, we can find a positive integer $m$ such that the following inequality for the probability of decryption error holds:

$$P\{f^{-1}(E^T a + f(v)) \ne v\} \le 2l\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + l\delta.$$

Here $\Phi$ is the cumulative distribution function of the standard normal distribution, i.e. $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\,dy$.

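Proof of Theorem 2 As in the proof of Theorem 1, we need to estimate the probability of $\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}$. Since the coordinates of $E_i$ are independent and identically distributed, and $E_i$ and $a$ are also independent, by the central limit theorem (Riauba, 1975), $E_i^T a$ is approximately normally distributed with mean 0 and standard deviation $d = \sqrt{m\operatorname{Var}(E_{ij})\operatorname{Var}(a_i)} = \beta\sqrt{\frac{mr(r+1)}{3}}$. Thus, for any sufficiently small $\delta > 0$, there is a positive integer $m$ such that

$$\begin{aligned} P\left\{\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}\right\} &= P\left\{\left|E_i^T a\right| \ge \frac{q-t}{2t}\right\} \\ &= P\left\{\left|E_i^T a\right| \Big/ \left(\beta\sqrt{\frac{mr(r+1)}{3}}\right) \ge \frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right\} \\ &= 2\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + \varepsilon. \end{aligned}$$

Here $|\varepsilon| < \delta$. Then we get the following inequality for the probability of decryption error of the LWE-based cryptosystem for general disturbance

$$\begin{aligned} P\{f^{-1}(E^T a + f(v)) \ne v\} &\le l\,P\{f^{-1}(E_i^T a + f(v_i)) \ne v_i\} \\ &\le l\,P\left\{\left|\frac{t}{q}E_i^T a\right| \ge \frac{1}{2} - \frac{t}{2q}\right\} \\ &= 2l\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + l\varepsilon \\ &< 2l\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + l\delta. \end{aligned}$$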
This probability can also be made close to 0 if we choose the parameters $\beta\sqrt{m}$ and $\delta$ small enough. Therefore, the probability of decryption error of the LWE-based cryptosystem for general disturbance can be made very small, which leads to high security.
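The two bounds are consistent with each other: substituting the Gaussian standard deviation $\beta = \alpha q/\sqrt{2\pi}$ into the bound of Theorem 2 and dropping the term $l\delta$ gives exactly the bound of Theorem 1. The sketch below evaluates both right-hand sides; the parameter values are arbitrary illustrations and the function names are hypothetical.

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_bound(t, q, l, m, r, alpha):
    """Right-hand side of Theorem 1."""
    x = (q - t) / (2 * alpha * t * q) * math.sqrt(6 * math.pi / (m * r * (r + 1)))
    return 2 * l * (1.0 - Phi(x))

def general_bound(t, q, l, m, r, beta, delta):
    """Right-hand side of Theorem 2 for a disturbance with standard deviation beta."""
    x = (q - t) / (2 * beta * t) * math.sqrt(3 / (m * r * (r + 1)))
    return 2 * l * (1.0 - Phi(x)) + l * delta

t, q, l, m, r, alpha = 2, 257, 4, 64, 1, 0.05
beta = alpha * q / math.sqrt(2 * math.pi)   # Gaussian case of Theorem 1

b1 = gaussian_bound(t, q, l, m, r, alpha)
b2 = general_bound(t, q, l, m, r, beta, 0.0)
print(b1, b2)
```

For a non-Gaussian disturbance only $\beta$ changes in this computation; the price for the generality is the additive term $l\delta$ coming from the central limit approximation.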


Example 1 Lett = 2, q = 5,1 = 1,m = 1,r = 1, ô = 1073, v € Z is uniformly 
chosen at random, the disturbance E is a random variable with the distribution yg 


such that P(E =k} = fee for integer k and P{E = 0} = e ? with parameter 


B = 10, a € (—1,0, 1} is uniformly chosen at random. Then the probability of 
decryption error 


P(f I(Ea-- f(v)) Zv) 2 P ls (£a+ >) £ 7 
=. 2 n 0 lp 2g 2 1 
= 7 [sn] eo] ine |] 


1 1 
Sate PSOPEOSPIS #0) 


=1— P{E = 0} =1 —g 0001 < 107°. 


On the other hand, 


q—t 3 a9 
(1 (5 mera) +> n 


So it follows that 


-t] 3 
PU (Ea + f0) s v) « (I (5: zu) 


The inequality in Theorem 2 holds. 


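Example 2 Let $t = 2$, $q = 5$, $l = 1$, $m = 1$, $r = 1$, $\delta = 10^{-4}$, let $v \in \mathbb{Z}_t$ be uniformly chosen at random, let the disturbance $E$ be a Laplace distribution variable with parameter $\lambda = 0.05$ and probability density function $f(x) = \frac{1}{2\lambda}e^{-\frac{|x|}{\lambda}}$, rounded to the nearest integer, and let $a \in \{-1, 0, 1\}$ be uniformly chosen at random. As in Example 1, the probability of decryption error satisfies

$$P\{f^{-1}(Ea + f(v)) \ne v\} = P\left\{\left[\frac{2}{5}\left(Ea + \left[\frac{5}{2}v\right]\right)\right] \ne v\right\} \le 1 - P\{E = 0\} = 1 - \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{1}{2\lambda}e^{-\frac{|x|}{\lambda}}\,dx = e^{-10} < 10^{-4}.$$

On the other hand,

$$2l\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + l\delta > 10^{-4}.$$

It follows that

$$P\{f^{-1}(Ea + f(v)) \ne v\} < 2l\left(1 - \Phi\left(\frac{q-t}{2\beta t}\sqrt{\frac{3}{mr(r+1)}}\right)\right) + l\delta.$$

The inequality in Theorem 2 holds.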


3 Results and Conclusions 


In this work, we first introduce the LWE problem and the LWE-based cryptosystem. We give a more precise estimate of the probability of decryption error based on independent identically distributed Gaussian disturbances. The salient significance of our work is that for any general independent identically distributed disturbances, we also give an estimate of the probability of decryption error using the central limit theorem. The upper bound can be made close to 0 if we choose suitable parameters, which means that the probability of decryption error for the cryptosystem can be made sufficiently small. We thus confirm that the LWE-based cryptosystem could have high security.


4 Discussions 


4.1 Future Work


Although we have reached our objective in this work, there are still many interesting problems to study in this research area. We will focus later on cryptosystems based on fully homomorphic encryption (FHE), which is an application of LWE (Brakerski & Vaikuntanathan, 2011a, b; Dijk et al., 2010; Gentry, 2009; Gentry et al., 1999). Fully homomorphic encryption was known to have abundant applications in cryptography, but for three decades no plausibly secure scheme was known until 2009. To date, FHE-based cryptography has gone through more than three generations. The third-generation FHE scheme based on the LWE problem has been shown to have some unique and advantageous properties (Gentry et al., 1999). Some of its techniques can still be improved and need to be studied in depth.
On the High Dimensional RSA Algorithm: A Public Key Cryptosystem Based on Lattice and Algebraic Number Theory


Zheng Zhiyong, Liu Fengxia, and Chen Man 


Abstract The best-known public key cryptosystem was introduced in 1978 by Rivest et al. (1978) and is now called the RSA public key cryptosystem in their honor. Later, a few authors gave a simple extension of RSA over algebraic number fields (see Takagi and Naito (2015), Uematsu et al. (1985, 1986)), but they require that the ring of algebraic integers is a Euclidean ring, and this requirement is much stronger than the class number one condition. In this chapter, we introduce a high dimensional form of RSA by making use of the ring of algebraic integers of an algebraic number field and lattice theory. We give an attainable algorithm (see Algorithm 1) which is significant from both the theoretical and the practical point of view. Our main purpose in this chapter is to show that the high dimensional RSA is indeed a lattice-based public key cryptosystem, which could be considered a new member of the family of post-quantum cryptography (see Peikert (2014), Pradhan et al. (2019)). On the other hand, we give a matrix expression for any algebraic number field (see Theorem 2), which is a new result even in the sense of classical algebraic number theory.
1 Introduction 


Let $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ be the fields of rational, real, and complex numbers, 
respectively, and let $\mathbb{Z}$ be the ring of integers. Let $E \subset \mathbb{C}$ be an algebraic number 
field of degree $n$, and let $R \subset E$ be the ring of algebraic integers of $E$. Suppose that 
$A \subset R$ is a non-zero ideal (all ideals in this chapter are non-zero). Then the factor ring 
$R/A$ is a finite ring. We denote by $N(A)$ the number of elements of $R/A$, which is 
called the norm of $A$, and by $\varphi(A)$ the number of invertible elements of $R/A$, 
which is called the Euler totient function of $A$. For any $\alpha \in R$, the principal ideal 
generated by $\alpha$ is denoted by $\alpha R$; then $\alpha$ is an invertible element of $R/A$ if and only 
if $(\alpha R, A) = 1$. It is known (see Theorem 1.19 of Narkiewicz (2004)) that 

$$\varphi(A) = N(A) \prod_{P \mid A} \left(1 - \frac{1}{N(P)}\right), \tag{1}$$

where the product is extended over all prime ideals $P$ dividing $A$. Moreover, if $\alpha \in R$ 
and $(\alpha R, A) = 1$, then 

$$\alpha^{\varphi(A)} \equiv 1 \pmod{A}. \tag{2}$$


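To generalize RSA to an arbitrary algebraic number field $E$, we first show the 
following assertion. 


Theorem 1 Let $P_1$ and $P_2$ be two distinct prime ideals of $R$ and $A = P_1 P_2$. Then for 
any $\alpha \in R$ and any integer $k \geq 0$, we have 

$$\alpha^{k\varphi(A)+1} \equiv \alpha \pmod{A}. \tag{3}$$

Proof Let $\alpha \in R$. If $(\alpha R, A) = 1$, then (3) follows directly from (2). If $(\alpha R, A) = 
A$, then $\alpha R \subset A$ and $\alpha \in A$, so (3) is trivial. Thus, we only need to consider the cases 
$(\alpha R, A) = P_1$ and $(\alpha R, A) = P_2$. If $(\alpha R, A) = P_1$, then $(\alpha R, P_2) = 1$, and by (2) we 
have 

$$\alpha^{\varphi(P_2)} \equiv 1 \pmod{P_2}.$$

Since $\varphi(A) = \varphi(P_1)\varphi(P_2)$, it follows that 

$$\alpha^{k\varphi(A)} \equiv 1 \pmod{P_2}, \quad \forall k \in \mathbb{Z},\ k \geq 0.$$

Therefore, there exists an element $\beta \in P_2$ such that 

$$\alpha^{k\varphi(A)} = 1 + \beta.$$

We thus have 

$$\alpha^{k\varphi(A)+1} = \alpha + \alpha\beta, \quad \text{and} \quad \alpha^{k\varphi(A)+1} \equiv \alpha \pmod{A},$$

since $\alpha \in P_1$ and $\beta \in P_2$ give $\alpha\beta \in P_1 P_2 = A$. The same argument gives (3) when $(\alpha R, A) = P_2$. 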
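
The identity (3) can be checked numerically in the simplest case $n = 1$, where $R = \mathbb{Z}$ and $A = pq\mathbb{Z}$. The following is a minimal sketch under that assumption only; the function name `check_power_identity` and the small primes are illustrative choices, not part of the chapter.

```python
def check_power_identity(p: int, q: int, k_max: int = 3) -> bool:
    """Check a^(k*phi + 1) == a (mod p*q) for every residue a and 0 <= k <= k_max.

    Assumes p and q are primes; phi = (p - 1) * (q - 1) is the Euler totient
    of p*q only when p and q are distinct.
    """
    modulus = p * q
    phi = (p - 1) * (q - 1)
    return all(
        pow(a, k * phi + 1, modulus) == a
        for k in range(k_max + 1)
        for a in range(modulus)
    )
```

Note that the check runs over every residue, including those not coprime to $pq$, which is exactly the point of Theorem 1. With $p = q$ the ideal is no longer a product of distinct primes and the identity fails, for example for $a = 3$ modulo $9$.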
Table 1 RSA in the ring of algebraic integers 


RSA in the Ring of Algebraic Integers 


• Parameters: $n \geq 1$ is a positive integer, $E/\mathbb{Q}$ is an algebraic number field of 
degree $n$, $R \subset E$ is the ring of algebraic integers of $E$. $P_1$ and $P_2$ 
are two distinct prime ideals of $R$, $A = P_1 P_2$, $R/A$ is the factor ring, 
$S$ is a set of coset representatives of $R/A$, $\varphi(A)$ is the Euler 
function of $A$, $1 \leq e < \varphi(A)$ and $1 \leq d < \varphi(A)$ are two positive 
integers such that $ed \equiv 1 \pmod{\varphi(A)}$ 

• Public keys: The ideal $A$ and the positive integer $e$ are the public keys 

• Private keys: The prime ideals $P_1$, $P_2$ and the positive integer $d$ are the 
private keys 

• Encryption: For any input message $\alpha \in S$, the ciphertext $c$ is $c \equiv \alpha^{e} \pmod{A}$ 

• Decryption: $c^{d} \equiv \alpha^{ed} \equiv \alpha \pmod{A}$; one can find the plaintext $\alpha$ from $c$ in $S$ 


According to Theorem 1, one can easily extend the classical RSA over an algebraic 
number field as in Table 1 (see also Takagi and Naito (2015), which, however, does not 
give a proof of (3)). 

Obviously, if $n = 1$, the above algorithm is the ordinary RSA. However, it is 
in general difficult to find the prime ideals of $R$ and to construct a set of coset 
representatives of $R/A$. In Takagi and Naito (2015), the authors supposed that the ring $R$ is a 
Euclidean ring, so that $S$ can be constructed by the Euclidean algorithm in $R$. The simplest way is 
to select a prime element $\alpha$ in $R$, so that the principal ideal $\alpha R$ is a prime ideal. In 
Algorithm I, we construct precisely a set of coset representatives for the factor 
ring $R/A$ by means of lattice theory. Here we first give an approximate construction of 
such a set. 

If $P \subset R$ is a prime ideal, then $P \cap \mathbb{Z} = p\mathbb{Z}$, where $p \in \mathbb{Z}$ is a rational prime 
number. Since $R/P$ is a finite field and $\mathbb{Z}/(p\mathbb{Z}) \subset R/P$, we have $N(P) = p^{f}$, where 
$f$ ($1 \leq f \leq n$) is called the degree of $P$. We write $pR = P_1^{e_1} P_2^{e_2} \cdots P_s^{e_s}$, where 
$P = P_1$ and the $P_i$ are distinct prime ideals; $e_i$ is called the ramification index of $P_i$. 
There exists a remarkable relation among the ramification indexes $e_i$ and the degrees $f_i$ (see 
Theorem 3 of page 181 of Ireland and Rosen (1990)): 

$$\sum_{i=1}^{s} e_i f_i = n. \tag{4}$$

Let $\{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subset R$ be an integral basis for $E/\mathbb{Q}$ and $A = P_1 P_2$. Suppose that 
$P_1 \cap \mathbb{Z} = p\mathbb{Z}$ and $P_2 \cap \mathbb{Z} = q\mathbb{Z}$; then $A \cap \mathbb{Z} = pq\mathbb{Z}$, where $p$ and $q$ are two distinct 
rational prime numbers. 
Lemma 1 Let 

$$S_1 = \left\{ \sum_{i=1}^{n} a_i \alpha_i \;\middle|\; 0 \leq a_i < pq,\ a_i \in \mathbb{Z},\ 1 \leq i \leq n \right\}. \tag{5}$$

Then $S_1$ covers a set of coset representatives of $R/A$. Moreover, if the degrees of $P_1$ 
and $P_2$ are both $n$, then $S_1$ is precisely a set of coset representatives of $R/A$. 


Proof Since $A = P_1 P_2$, $P_1 \cap \mathbb{Z} = p\mathbb{Z}$, and $P_2 \cap \mathbb{Z} = q\mathbb{Z}$, we have $pqR \subset A$; thus 
$R/pqR$ maps onto $R/A$. To prove the first assertion, it is enough to show that $S_1$ is a 
set of coset representatives of $R/pqR$. Since $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ is an integral basis, 

$$R = \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots + \mathbb{Z}\alpha_n.$$

Suppose that $\alpha = \sum_{i=1}^{n} m_i \alpha_i \in R$, and write $m_i = a_i pq + r_i$, where $0 \leq r_i < pq$. Clearly 

$$\alpha \equiv \sum_{i=1}^{n} r_i \alpha_i \pmod{pqR}.$$

Thus every coset of $pqR$ contains an element of $S_1$. If $\sum_{i=1}^{n} r_i \alpha_i$ and $\sum_{i=1}^{n} r_i' \alpha_i$ are in 
$S_1$ and in the same coset mod $pqR$, then 

$$\sum_{i=1}^{n} (r_i - r_i') \alpha_i \equiv 0 \pmod{pqR}.$$

Since the $\alpha_i$ are linearly independent, it follows that 

$$r_i \equiv r_i' \pmod{pq} \quad \text{and} \quad r_i = r_i', \quad 1 \leq i \leq n.$$

Next, suppose that the degrees of $P_1$ and $P_2$ are $n$; then $N(P_1) = p^{n}$ and $N(P_2) = q^{n}$. 
By (4) we thus have $P_1 = pR$, $P_2 = qR$, and $A = pqR$. The second assertion follows 
immediately. 


If one replaces $S$ by $S_1$ in Table 1, then the probability of successful decryption is 

$$N(A)/p^{n}q^{n} = p^{f_1 - n} q^{f_2 - n}, \tag{6}$$

where $f_1$ and $f_2$ are the degrees of $P_1$ and $P_2$, respectively. 


We note that $f_1 = f_2 = n$ if and only if $P_1 = pR$ and $P_2 = qR$; in this special 
case, we may give a numerical explanation. It is easy to see that 

$$\varphi(A) = \varphi(pR)\varphi(qR) = (p^{n} - 1)(q^{n} - 1).$$

By Theorem 1, for any $a \in \mathbb{Z}$, we have 

$$a^{k(p^{n}-1)(q^{n}-1)+1} \equiv a \pmod{pq}, \quad k \in \mathbb{Z},\ k \geq 0. \tag{7}$$

Since $S_1$ is a set of coset representatives of $R/A$ and $\alpha = \sum_{i=1}^{n} a_i \alpha_i \in S_1$, we may regard 
$\alpha$ as a vector $(a_1, a_2, \ldots, a_n) \in \mathbb{Z}^{n}$. Let $m = pq$, $1 \leq e < (p^{n} - 1)(q^{n} - 1)$ and 
$1 \leq d < (p^{n} - 1)(q^{n} - 1)$ be such that 

$$ed \equiv 1 \pmod{(p^{n} - 1)(q^{n} - 1)}.$$

Then for every input message $\alpha = (a_1, a_2, \ldots, a_n)$, we use the public key $(m, e)$ 
and the private key $(p, q, d)$ to encrypt and decrypt each $a_i$ in turn. These are 
the algorithms given by Takagi and Naito (2015); we regard them 
as a simple coordinate-wise repetition of RSA. 
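
To make the remark concrete, the coordinate-wise scheme just described can be sketched as follows. This is only an illustration under the assumptions stated in the text ($P_1 = pR$, $P_2 = qR$); the function names and the toy parameters are hypothetical, and the code relies on Python 3.8 or later for modular inverses via `pow`.

```python
from math import gcd


def keygen(p: int, q: int, n: int, e: int):
    """Return ((m, e), (p, q, d)) with e*d = 1 mod (p^n - 1)(q^n - 1)."""
    m = p * q
    phi_n = (p**n - 1) * (q**n - 1)
    if gcd(e, phi_n) != 1:
        raise ValueError("e must be coprime to (p^n - 1)(q^n - 1)")
    d = pow(e, -1, phi_n)  # modular inverse (Python >= 3.8)
    return (m, e), (p, q, d)


def encrypt(message, public_key):
    """Encrypt each coordinate a_i separately, as in classical RSA."""
    m, e = public_key
    return [pow(a, e, m) for a in message]


def decrypt(cipher, public_key, private_key):
    """Decrypt each coordinate separately; correctness follows from (7)."""
    m, _ = public_key
    _, _, d = private_key
    return [pow(c, d, m) for c in cipher]
```

For instance, with $p = 5$, $q = 7$, $n = 2$ and $e = 5$, a message such as `[10, 34]` is recovered after encryption and decryption, even though $10$ is not coprime to $35$.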

The main purpose of this chapter is to show that the high dimensional form of the RSA 
algorithm is a lattice-based cryptosystem in general. To do this, we first establish 
a relationship between an algebraic number field $E$ and the space $\mathbb{Q}^{n}$. Let 
$\mathbb{R}^{n}$ be the Euclidean space, which is a linear space over $\mathbb{R}$ with the Euclidean norm 
$|x|$, 

$$|x| = \left( \sum_{i=1}^{n} x_i^{2} \right)^{1/2}, \quad \text{where } x' = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^{n}. \tag{8}$$

We use column notation for vectors in $\mathbb{R}^{n}$, and $x'$ is the transpose of $x$, which is 
called a row vector in $\mathbb{R}^{n}$. $\mathbb{Q}^{n} \subset \mathbb{R}^{n}$ is a $\mathbb{Q}$-linear subspace of $\mathbb{R}^{n}$. 

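Without loss of generality, an algebraic number field $E$ of degree $n$ may be 
expressed as $E = \mathbb{Q}(\theta)$, where $\theta$ is an algebraic integer of degree $n$ and $\mathbb{Q}(\theta)$ is 
the field generated by $\theta$ over $\mathbb{Q}$. Let $\phi(x)$ be the minimal polynomial of $\theta$, 

$$\phi(x) = x^{n} - \phi_{n-1}x^{n-1} - \cdots - \phi_1 x - \phi_0 \in \mathbb{Z}[x], \tag{9}$$

where all $\phi_i \in \mathbb{Z}$. It is known that 

$$E = \mathbb{Q}(\theta) = \left\{ \sum_{i=0}^{n-1} a_i \theta^{i} \;\middle|\; a_i \in \mathbb{Q} \right\}. \tag{10}$$

We define a one-to-one correspondence $\tau$ between $E$ and $\mathbb{Q}^{n}$ by 

$$\tau: \ \alpha = \sum_{i=0}^{n-1} a_i \theta^{i} \in E \ \longmapsto \ \overline{\alpha} = \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} \in \mathbb{Q}^{n}, \tag{11}$$

and write $\tau(\alpha) = \overline{\alpha}$ or $\alpha \leftrightarrow \overline{\alpha}$. In fact, $\tau$ is a homomorphism of additive groups from 
$E$ to $\mathbb{Q}^{n}$, and moreover $\tau(a\alpha) = a\tau(\alpha)$ for all $a \in \mathbb{Q}$. 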
As usual, the trace and norm mappings from $E$ to $\mathbb{Q}$ are denoted by 

$$\mathrm{tr}(\alpha) = \mathrm{tr}_{E/\mathbb{Q}}(\alpha), \quad \text{and} \quad N(\alpha) = N_{E/\mathbb{Q}}(\alpha).$$

It is known (see the corollary on page 58 of Narkiewicz (2004)) that 

$$N(\alpha R) = |N(\alpha)|, \quad \forall \alpha \in R. \tag{12}$$

A full-rank lattice $L$ is a discrete additive subgroup of $\mathbb{R}^{n}$; an equivalent expression 
for $L$ is (see Micciancio and Regev (2009), Zheng et al. (2023)) 

$$L = L(B) = \{ Bx \mid x \in \mathbb{Z}^{n} \}, \tag{13}$$

where $B = [\beta_1, \beta_2, \ldots, \beta_n]_{n \times n} \in \mathbb{R}^{n \times n}$ is an invertible $n \times n$ matrix, 
called a generator matrix of $L$. If $L \subset \mathbb{Q}^{n}$, we call $L$ a rational lattice; if $L \subset \mathbb{Z}^{n}$, 
we call $L$ an integer lattice. It is not difficult to see that every ideal of $R$ corresponds 
to a rational lattice; we have the following. 


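Lemma 2 Let $A \subset R$ be an ideal and $A \neq 0$; then $\tau(A)$ is a rational lattice. 


Proof Let $\{\beta_1, \beta_2, \ldots, \beta_n\} \subset A$ be an integral basis of $A$; one has 

$$A = \mathbb{Z}\beta_1 + \mathbb{Z}\beta_2 + \cdots + \mathbb{Z}\beta_n.$$

It follows that 

$$\tau(A) = \mathbb{Z}\overline{\beta}_1 + \mathbb{Z}\overline{\beta}_2 + \cdots + \mathbb{Z}\overline{\beta}_n,$$

where $\overline{\beta}_i = \tau(\beta_i) \in \mathbb{Q}^{n}$. Let $B = [\overline{\beta}_1, \overline{\beta}_2, \ldots, \overline{\beta}_n]$. Since $\{\beta_1, \beta_2, \ldots, \beta_n\}$ is 
linearly independent over $\mathbb{Q}$, $B$ is an invertible matrix, and we have 

$$\tau(A) = L(B) = \{ Bx \mid x \in \mathbb{Z}^{n} \}.$$

The lemma follows at once. 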


Let $L \subset \mathbb{Q}^{n}$ be a rational lattice. If $L$ corresponds to an ideal $A$ of 
some suitable algebraic number field $E$, we call $L$ an ideal lattice. Ideal lattices were 
first introduced by Lyubashevsky and Micciancio (2006) in the case of integer lattices; 
here we generalize this notion to the case of rational lattices. For a more detailed 
discussion of ideal lattices, we refer to Zheng et al. (2023). 

To give an attainable algorithm for high dimensional RSA, we require the following 
NC-property for the algebraic number field $E$. 


$$\text{NC-property:} \quad E = \mathbb{Q}(\theta) \ \text{ and } \ R = \mathbb{Z}[\theta], \tag{14}$$

where 

$$\mathbb{Z}[\theta] = \left\{ \sum_{i=0}^{n-1} a_i \theta^{i} \;\middle|\; a_i \in \mathbb{Z},\ 0 \leq i \leq n-1 \right\}. \tag{15}$$

Some of the well-known algebraic number fields satisfy the NC-property; we list 
a few as follows (Table 2). 


Table 2 Algebraic number fields with NC-property 


Algebraic Number Fields with NC-property 


• Quadratic Fields (see Proposition 13.1.1 of Ireland and Rosen (1990)) 
$E = \mathbb{Q}(\sqrt{d})$, where $d \in \mathbb{Z}$ is a square-free integer and $d \equiv 2, 3 \pmod{4}$ 
• Cyclotomic Fields (see Theorem 2.6 of Washington (1982)) 
$E = \mathbb{Q}(\xi_n)$, where $\xi_n = e^{2\pi i/n}$ is a primitive $n$-th root of unity 
• Totally Real Algebraic Number Fields (see Proposition 2.16 of Washington (1982)) 
$E = \mathbb{Q}(\xi_n + \xi_n^{-1})$, and $E \subset \mathbb{R}$ is the maximal real subfield of $\mathbb{Q}(\xi_n)$ 


2 Ideal Matrices 


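Suppose that $\theta$ is an algebraic integer of degree $n$ and that $\phi(x) = x^{n} - \phi_{n-1}x^{n-1} - \cdots - 
\phi_1 x - \phi_0 \in \mathbb{Z}[x]$ is the minimal polynomial of $\theta$; thus $\phi(x)$ is irreducible. Let $\theta = 
\theta_0, \theta_1, \theta_2, \ldots, \theta_{n-1}$ be the $n$ different roots of $\phi(x)$. The Vandermonde matrix of $\phi(x)$ is 
defined by 

$$V = V_\phi = \begin{pmatrix} 1 & \theta_0 & \cdots & \theta_0^{n-1} \\ 1 & \theta_1 & \cdots & \theta_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \theta_{n-1} & \cdots & \theta_{n-1}^{n-1} \end{pmatrix}, \quad \text{and} \quad \Delta = \det(V_\phi) \neq 0. \tag{16}$$

According to $\phi(x)$, we denote the rotation matrix or adjoint matrix (see page 116 
of Manin and Panchishkin (2005)) by 

$$H = H_\phi = \begin{pmatrix} 0 & \cdots & 0 & \phi_0 \\ & & & \phi_1 \\ & I_{n-1} & & \vdots \\ & & & \phi_{n-1} \end{pmatrix}_{n \times n}, \tag{17}$$

where $I_{n-1}$ is the unit matrix of dimension $n - 1$. 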


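Definition 1 An ideal matrix $H^*(f)$ generated by the input vector $f \in \mathbb{R}^{n}$ is defined 
by 

$$H^*(f) = [f, Hf, H^{2}f, \ldots, H^{n-1}f]_{n \times n} \in \mathbb{R}^{n \times n}, \tag{18}$$

and the sets of all ideal matrices generated by real and by rational vectors are denoted by 

$$M^*_{\mathbb{R}} = \{ H^*(f) \mid f \in \mathbb{R}^{n} \} \quad \text{and} \quad M^*_{\mathbb{Q}} = \{ H^*(f) \mid f \in \mathbb{Q}^{n} \}. \tag{19}$$


Definition 2 For any two vectors $f$ and $g$ in $\mathbb{R}^{n}$, the $\phi$-conventional product is 
defined by 

$$f \otimes g = H^*(f)g, \tag{20}$$

and the $m$-multi product is denoted by 

$$f^{\otimes m} = \underbrace{f \otimes f \otimes \cdots \otimes f}_{m}, \quad m \in \mathbb{Z},\ m \geq 1. \tag{21}$$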


Remark 1 If $\phi(x) = x^{n} - 1$, then $H^*(f)$ is the classical circulant matrix (see Davis 
(1994)); the conventional product with a circulant matrix was first proposed by Hoffstein 
et al. (1998) and plays a key role in their cryptosystem. In Zheng et al. 
(2023), we generalized this definition to more general rotation matrices. 
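
Definitions 1 and 2 can be illustrated with a short sketch. It assumes, as in the text, that $\phi(x) = x^{n} - \phi_{n-1}x^{n-1} - \cdots - \phi_0$ is passed as the coefficient list `[phi_0, ..., phi_{n-1}]` and that vectors are plain Python lists of integers; the function names are illustrative only.

```python
def rotation_matrix(phi):
    """Rotation matrix H: ones on the subdiagonal, last column (phi_0, ..., phi_{n-1})."""
    n = len(phi)
    H = [[0] * n for _ in range(n)]
    for i in range(1, n):
        H[i][i - 1] = 1
    for i in range(n):
        H[i][n - 1] = phi[i]
    return H


def mat_vec(M, v):
    """Matrix-vector product with exact integer arithmetic."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def ideal_matrix(phi, f):
    """Ideal matrix H*(f) = [f, Hf, ..., H^(n-1) f], stored as a list of rows."""
    H = rotation_matrix(phi)
    n = len(f)
    columns = [list(f)]
    for _ in range(n - 1):
        columns.append(mat_vec(H, columns[-1]))
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def conv_product(phi, f, g):
    """The product f (x) g = H*(f) g of Definition 2."""
    return mat_vec(ideal_matrix(phi, f), g)
```

With `phi = [1, 0, 0]`, that is $\phi(x) = x^{3} - 1$, the ideal matrix of $(1, 2, 3)$ is the circulant matrix with rows $(1,3,2)$, $(2,1,3)$, $(3,2,1)$, in line with Remark 1. With `phi = [2, 0]` the product of $(1,1)$ and $(3,1)$ is $(5,4)$, matching $(1+\sqrt{2})(3+\sqrt{2}) = 5 + 4\sqrt{2}$.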


By (18), $H^*(f) = 0$ is the zero matrix if and only if $f = 0$ is the zero vector, and 
$H^*(f + g) = H^*(f) + H^*(g)$; hence $H^*(f) = H^*(g)$ if and only if $f = g$. Thus 
we may regard $H^*: \mathbb{R}^{n} \to M^*_{\mathbb{R}}$ as a one-to-one correspondence, which is also a 
homomorphism of Abelian groups. 

The main aim of this section is to show that $\mathbb{Q}^{n}$ is a field under the $\phi$-conventional 
product and that $M^*_{\mathbb{Q}}$ is also a field under the ordinary addition and multiplication 
of matrices, both of which are isomorphic to the algebraic number field $E = \mathbb{Q}(\theta)$. 
To do this, we require some basic properties of the ideal matrices. 


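Let $e_1, e_2, \ldots, e_n$ be the unit vectors of $\mathbb{R}^{n}$, namely 

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}. \tag{22}$$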


Lemma 3 Let $\tau$ be defined by (11); then we have 

$$\tau(\theta^{k}) = e_{k+1}, \quad 0 \leq k \leq n-1, \quad \text{and} \quad H^*(e_k) = H^{k-1}, \quad 1 \leq k \leq n. \tag{23}$$

Proof $\tau(\theta^{k}) = e_{k+1}$ follows directly from the definition of $\tau$. We use induction 
to prove $H^*(e_k) = H^{k-1}$. It is easy to see that $H^*(e_1) = I_n$, the unit matrix of dimension 
$n$. Suppose that $H^*(e_{k-1}) = H^{k-2}$ for some $k \geq 2$. Noting that $e_k = He_{k-1}$, it 
follows that 

$$\begin{aligned} H^*(e_k) &= [He_{k-1}, H^{2}e_{k-1}, \ldots, H^{n}e_{k-1}] \\ &= H[e_{k-1}, He_{k-1}, \ldots, H^{n-1}e_{k-1}] \\ &= HH^*(e_{k-1}) = HH^{k-2} = H^{k-1}. \end{aligned}$$

The lemma follows immediately. 


Since $\phi(x)$ is the characteristic polynomial of $H$, by the Hamilton-Cayley 
theorem, we have 

$$\phi(H) = 0, \quad \text{or} \quad H^{n} = \phi_0 I_n + \phi_1 H + \cdots + \phi_{n-1}H^{n-1}. \tag{24}$$

Therefore, all the powers $H^{k}$ ($k \geq 0$) of the rotation matrix are ideal matrices; in particular, the 
unit matrix $I_n = H^*(e_1)$ is an ideal matrix. 

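Let $\mathbb{R}[x]$ be the polynomial ring and $\mathbb{R}[x]/(\phi(x))$ be the quotient ring, where 
$(\phi(x))$ is the principal ideal generated by $\phi(x)$ in $\mathbb{R}[x]$. We establish a one-to-one 
correspondence $t$ between $\mathbb{R}^{n}$ and $\mathbb{R}[x]/(\phi(x))$ by 

$$f = \begin{pmatrix} f_0 \\ f_1 \\ \vdots \\ f_{n-1} \end{pmatrix} \in \mathbb{R}^{n} \ \overset{t}{\longmapsto} \ f(x) = f_0 + f_1 x + \cdots + f_{n-1}x^{n-1} \in \mathbb{R}[x]/(\phi(x)), \tag{25}$$

and write $t(f) = f(x)$, or $t^{-1}(f(x)) = f$. 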
Lemma 4 For any $f \in \mathbb{R}^{n}$, the ideal matrix $H^*(f)$ is given by 

$$H^*(f) = f(H) = f_0 I_n + f_1 H + \cdots + f_{n-1}H^{n-1}. \tag{26}$$

Moreover, if $F(x) \in \mathbb{R}[x]$ and $F(x) \equiv f(x) \pmod{\phi(x)}$, then $f(H) = F(H)$. 

Proof Writing $f = f_0 e_1 + f_1 e_2 + \cdots + f_{n-1}e_n$, by Lemma 3, we have 

$$\begin{aligned} H^*(f) &= f_0 H^*(e_1) + f_1 H^*(e_2) + \cdots + f_{n-1}H^*(e_n) \\ &= f_0 I_n + f_1 H + \cdots + f_{n-1}H^{n-1} = f(H). \end{aligned}$$

Suppose that $F(x) \equiv f(x) \pmod{\phi(x)}$; by (24), we have $f(H) = F(H)$ immediately. 


Lemma 5 Let $f$ and $g$ be two vectors in $\mathbb{R}^{n}$, and let $f(x)$, $g(x)$ be the corresponding 
polynomials, respectively; then we have 

$$t(f \otimes g) \equiv f(x)g(x) \pmod{\phi(x)}. \tag{27}$$

Proof Since $t$ is a bijection, it suffices to show that 

$$t^{-1}(f(x)g(x)) = f \otimes g. \tag{28}$$

Let $g(x) = g_0 + g_1 x + \cdots + g_{n-1}x^{n-1} \in \mathbb{R}[x]/(\phi(x))$; then 

$$\begin{aligned} xg(x) &= g_0 x + \cdots + g_{n-1}x^{n} \\ &= g_{n-1}\phi_0 + (g_0 + \phi_1 g_{n-1})x + \cdots + (g_{n-2} + \phi_{n-1}g_{n-1})x^{n-1}. \end{aligned}$$

It follows that 

$$t^{-1}(xg(x)) = Ht^{-1}(g(x)) = Hg.$$

More generally, we have 

$$t^{-1}(x^{k}g(x)) = H^{k}t^{-1}(g(x)) = H^{k}g, \quad 0 \leq k \leq n-1. \tag{29}$$

Let $f(x) = f_0 + f_1 x + \cdots + f_{n-1}x^{n-1}$; then 

$$t^{-1}(f(x)g(x)) = \sum_{k=0}^{n-1} f_k\, t^{-1}(x^{k}g(x)) = \sum_{k=0}^{n-1} f_k H^{k}g = H^*(f)g = f \otimes g.$$

The lemma follows immediately. 
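
Lemma 5 says that the $\phi$-conventional product is polynomial multiplication modulo $\phi(x)$. A small sketch can compare the two computations directly; it is an illustration only, with $\phi$ given as the hypothetical coefficient list `[phi_0, ..., phi_{n-1}]` and with function names chosen here for convenience.

```python
def rotation_matrix(phi):
    """Rotation matrix H of x^n - phi_{n-1} x^(n-1) - ... - phi_0."""
    n = len(phi)
    H = [[0] * n for _ in range(n)]
    for i in range(1, n):
        H[i][i - 1] = 1
    for i in range(n):
        H[i][n - 1] = phi[i]
    return H


def mat_vec(M, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def conv_via_rotation(phi, f, g):
    """Compute sum_k f_k * H^k g, as in the last display of the proof."""
    H = rotation_matrix(phi)
    result = [0] * len(phi)
    power_g = list(g)  # holds H^k g, starting with k = 0
    for k in range(len(phi)):
        result = [r + f[k] * c for r, c in zip(result, power_g)]
        power_g = mat_vec(H, power_g)
    return result


def poly_mul_mod(phi, f, g):
    """Multiply f(x) g(x) and reduce with x^n = phi_0 + phi_1 x + ... + phi_{n-1} x^(n-1)."""
    n = len(phi)
    prod = [0] * (2 * n - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] += fi * gj
    for k in range(2 * n - 2, n - 1, -1):  # eliminate the highest degrees first
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] += c * phi[j]
    return prod[:n]
```

Both functions return the same coefficient vector for any integer inputs; for example, with `phi = [1, 1, 0]` (that is, $\phi(x) = x^{3} - x - 1$) the square of $x^{2}$ is $x + x^{2}$ in either computation.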


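Lemma 6 For any two vectors $f = (f_0, f_1, \ldots, f_{n-1})' \in \mathbb{R}^{n}$ and $g = (g_0, g_1, \ldots, g_{n-1})' \in \mathbb{R}^{n}$, we have 
the following properties for ideal matrices: 
(i) $H^*(f)H^*(g) = H^*(g)H^*(f)$; 
(ii) $H^*(f)H^*(g) = H^*(H^*(f)g)$; 
(iii) $H^*(f) = V_\phi^{-1}\, \mathrm{diag}\{f(\theta_0), f(\theta_1), \ldots, f(\theta_{n-1})\}\, V_\phi$; 
(iv) $\det(H^*(f)) = \prod_{i=0}^{n-1} f(\theta_i)$; 
(v) If $f \in \mathbb{Q}^{n}$, $f \neq 0$, then $H^*(f)$ is an invertible matrix and 
$(H^*(f))^{-1} = H^*(u)$, 
where $u(x) \in \mathbb{Q}[x]$ is the unique polynomial of degree less than $n$ such that $u(x)f(x) \equiv 1 \pmod{\phi(x)}$ in 
$\mathbb{Q}[x]$. 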


Proof By Lemma 4, we have 

$$H^*(f)H^*(g) = f(H)g(H) = g(H)f(H) = H^*(g)H^*(f).$$

To prove (ii), we write $H^*(f)g = f \otimes g$; by Lemma 5 and Lemma 4 it follows that 

$$H^*(H^*(f)g) = H^*(f \otimes g) = f(H)g(H) = H^*(f) \cdot H^*(g).$$

By Theorem 3.5 of Davis (1994), we have 

$$H = V_\phi^{-1}\, \mathrm{diag}\{\theta_0, \theta_1, \ldots, \theta_{n-1}\}\, V_\phi. \tag{30}$$

It follows that 

$$H^*(f) = f(H) = V_\phi^{-1}\, \mathrm{diag}\{f(\theta_0), f(\theta_1), \ldots, f(\theta_{n-1})\}\, V_\phi.$$

Since $\mathrm{diag}\{f(\theta_0), f(\theta_1), \ldots, f(\theta_{n-1})\}$ is a diagonal matrix, we have 

$$\det(H^*(f)) = \det\left(\mathrm{diag}\{f(\theta_0), f(\theta_1), \ldots, f(\theta_{n-1})\}\right) = \prod_{i=0}^{n-1} f(\theta_i).$$

To show the last assertion, since $f \in \mathbb{Q}^{n}$, $f \neq 0$, and $\phi(x)$ is an irreducible 
polynomial, we have $(f(x), \phi(x)) = 1$ in $\mathbb{Q}[x]$. Hence there are $u(x) \in \mathbb{Q}[x]$ and $v(x) \in 
\mathbb{Q}[x]$ such that 

$$u(x)f(x) + v(x)\phi(x) = 1.$$

By Lemma 5 and noting that $t^{-1}(1) = e_1 \in \mathbb{R}^{n}$, we have $u \otimes f = e_1$. It follows from (ii) that 

$$H^*(u)H^*(f) = H^*(u \otimes f) = H^*(e_1) = I_n.$$

This completes the proof of Lemma 6. 


Next, we discuss the algebraic number field $E = \mathbb{Q}(\theta)$ and recall that $\tau$ is a one-to-one 
correspondence between $E$ and $\mathbb{Q}^{n}$. 


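Lemma 7 For any two elements $\alpha$ and $\beta$ in $E$, we have 

$$\tau(\alpha\beta) = \tau(\alpha) \otimes \tau(\beta) = \overline{\alpha} \otimes \overline{\beta}. \tag{31}$$

Proof Let $\beta = \beta_0 + \beta_1\theta + \cdots + \beta_{n-1}\theta^{n-1}$, where $\beta_i \in \mathbb{Q}$. It is easily seen that 

$$\theta\beta = \phi_0\beta_{n-1} + (\beta_0 + \phi_1\beta_{n-1})\theta + \cdots + (\beta_{n-2} + \phi_{n-1}\beta_{n-1})\theta^{n-1},$$

thus we have $\tau(\theta\beta) = H\tau(\beta) = H\overline{\beta}$, and 

$$\tau(\theta^{k}\beta) = H^{k}\tau(\beta) = H^{k}\overline{\beta}, \quad 0 \leq k \leq n-1. \tag{32}$$

Let $\alpha = \alpha_0 + \alpha_1\theta + \cdots + \alpha_{n-1}\theta^{n-1}$; by Lemma 4, we have 

$$\tau(\alpha\beta) = \sum_{k=0}^{n-1} \alpha_k \tau(\theta^{k}\beta) = \sum_{k=0}^{n-1} \alpha_k H^{k}\overline{\beta} = H^*(\overline{\alpha})\overline{\beta} = \overline{\alpha} \otimes \overline{\beta},$$

and the lemma follows immediately. 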


Let $A = (a_{ij})_{n \times n}$ be a square matrix; the trace of $A$ is defined by $\mathrm{Tr}(A) = 
\sum_{i=1}^{n} a_{ii}$ as usual. The main result of this section is the following theorem. 


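Theorem 2 Let $E = \mathbb{Q}(\theta)$ be an algebraic number field of degree $n$, and $\phi(x) \in 
\mathbb{Z}[x]$ be the minimal polynomial of $\theta$. Then the linear space $\mathbb{Q}^{n}$ is a field under 
the $\phi$-conventional product, and the set $M^*_{\mathbb{Q}}$ of all ideal matrices generated by rational 
vectors is also a field under the ordinary addition and multiplication of matrices. Both of 
them are isomorphic to $E$, namely 

$$E \cong \mathbb{Q}^{n} \cong M^*_{\mathbb{Q}}. \tag{33}$$

Moreover, let $\alpha \in E$, and let $\mathrm{tr}(\alpha)$ and $N(\alpha)$ be the trace and norm of $\alpha$; then we have 

$$\mathrm{tr}(\alpha) = \mathrm{Tr}(H^*(\overline{\alpha})), \quad \text{and} \quad N(\alpha) = \det(H^*(\overline{\alpha})). \tag{34}$$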
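Proof Let $\tau: E \to \mathbb{Q}^{n}$ be given by (11). It is clear that 

$$\tau(\alpha + \beta) = \tau(\alpha) + \tau(\beta), \quad \text{and} \quad \tau(\alpha\beta) = \tau(\alpha) \otimes \tau(\beta).$$

Thus $\mathbb{Q}^{n}$ is a field under the $\phi$-conventional product and $E \cong \mathbb{Q}^{n}$. By Lemma 6, we 
have 

$$H^*(\overline{\alpha} + \overline{\beta}) = H^*(\overline{\alpha}) + H^*(\overline{\beta}) \quad \text{and} \quad H^*(\overline{\alpha} \otimes \overline{\beta}) = H^*(\overline{\alpha})H^*(\overline{\beta}),$$

thus $M^*_{\mathbb{Q}}$ is also a field and $E \cong \mathbb{Q}^{n} \cong M^*_{\mathbb{Q}}$. 

The main difficulty is to prove (34). We observe that $\theta$ induces a linear transformation 
of $E/\mathbb{Q}$ by $\alpha \mapsto \theta\alpha$, and the matrix of this linear transformation under the basis 
$\{1, \theta, \theta^{2}, \ldots, \theta^{n-1}\}$ is just $H$, namely 

$$\theta\,(1, \theta, \theta^{2}, \ldots, \theta^{n-1}) = (1, \theta, \theta^{2}, \ldots, \theta^{n-1})\,H.$$

By the definition of the trace, we have 

$$\mathrm{tr}(\theta) = \mathrm{Tr}(H), \quad \text{and} \quad \mathrm{tr}(\theta^{k}) = \mathrm{Tr}(H^{k}), \quad 1 \leq k \leq n-1.$$

Let $\alpha = \alpha_0 + \alpha_1\theta + \cdots + \alpha_{n-1}\theta^{n-1} \in E$; it follows that 

$$\mathrm{tr}(\alpha) = \sum_{k=0}^{n-1} \alpha_k\, \mathrm{tr}(\theta^{k}) = \sum_{k=0}^{n-1} \alpha_k\, \mathrm{Tr}(H^{k}) = \mathrm{Tr}\left( \sum_{k=0}^{n-1} \alpha_k H^{k} \right) = \mathrm{Tr}(H^*(\overline{\alpha})).$$

To show the conclusion on the norm, let $\alpha^{(i)}$ ($0 \leq i \leq n-1$) be the $n$ conjugates 
of $\alpha$ in the smallest normal extension of $\mathbb{Q}$ containing $E$, where $\alpha^{(0)} = \alpha = \alpha_0 + 
\alpha_1\theta + \cdots + \alpha_{n-1}\theta^{n-1}$. It is easily seen that 

$$\alpha^{(i)} = \sum_{k=0}^{n-1} \alpha_k \theta_i^{k}, \quad \text{where } \theta_0 = \theta \text{ and } 0 \leq i \leq n-1.$$

Writing $\alpha(x) = \sum_{k=0}^{n-1} \alpha_k x^{k}$, by property (iv) of Lemma 6, we have 

$$N(\alpha) = \prod_{i=0}^{n-1} \alpha^{(i)} = \prod_{i=0}^{n-1} \alpha(\theta_i) = \det(H^*(\overline{\alpha})).$$

This completes the proof of Theorem 2. 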
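
The formulas (34) are easy to test on small examples. The sketch below is an illustration rather than part of the chapter: it builds $H^*(\overline{\alpha})$ for a monic $\phi$ given as the hypothetical coefficient list `[phi_0, ..., phi_{n-1}]`, and computes the trace and determinant exactly with `fractions.Fraction`.

```python
from fractions import Fraction


def ideal_matrix(phi, f):
    """H*(f) = [f, Hf, ..., H^(n-1) f] for the rotation matrix H of phi."""
    n = len(phi)

    def apply_h(v):
        # (Hv)_0 = phi_0 * v_{n-1}, (Hv)_i = v_{i-1} + phi_i * v_{n-1}
        return [phi[0] * v[-1]] + [v[i - 1] + phi[i] * v[-1] for i in range(1, n)]

    columns = [list(f)]
    for _ in range(n - 1):
        columns.append(apply_h(columns[-1]))
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n, result = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result
```

For $\alpha = 3 + \sqrt{2}$ in $\mathbb{Q}(\sqrt{2})$ (`phi = [2, 0]`, `f = [3, 1]`) this gives trace $6$ and determinant $7 = 3^{2} - 2 \cdot 1^{2}$, and for $\alpha = 1 + \theta$ with $\theta^{3} = 2$ it gives trace $3$ and determinant $3 = 1^{3} + 2$, as expected from the usual trace and norm.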


The cyclic lattice in $\mathbb{R}^{n}$ was introduced by Micciancio (2007) (see also Zheng 
et al. (2023)), and plays an important role in Ajtai's construction of collision 
resistant hash functions (see Ajtai and Dwork (1997)). As an application, we show 
that every ideal in an algebraic number field corresponds to a cyclic lattice: 


Corollary 1 Let $A \subset R$ be an ideal and $A \neq 0$; then $\tau(A) \subset \mathbb{Q}^{n}$ is a cyclic lattice. 


Proof Suppose that $\alpha \in A$. Since $\theta \in R$, we have $\theta\alpha \in A$. By (31), we have 

$$\tau(\theta\alpha) = H\overline{\alpha} \in \tau(A).$$

Thus $\tau(A)$ is a cyclic lattice. 


3 High Dimensional RSA 


In this section, we give an attainable algorithm for the high dimensional RSA by 
making use of lattice theory; this algorithm is significant from both the theoretical 
and the practical point of view. Suppose that the algebraic number field $E$ satisfies 
the NC-property; then $R = \mathbb{Z}[\theta]$ is the ring of algebraic integers of $E$, and the restriction 
of the correspondence $\tau$ gives a ring isomorphism from $R$ to $\mathbb{Z}^{n}$. Let $\mathbb{Z}[x]$ be the ring of 
polynomials with integer coefficients and $(\phi(x))$ be the principal ideal generated by $\phi(x)$ 
in $\mathbb{Z}[x]$; it is easy to see that $R \cong \mathbb{Z}[x]/(\phi(x))$. Let $M^*_{\mathbb{Z}}$ be the set of ideal matrices 
generated by integral vectors, i.e. 

$$M^*_{\mathbb{Z}} = \{ H^*(f) \mid f \in \mathbb{Z}^{n} \}. \tag{35}$$

Then the following four rings are isomorphic to each other: 

$$\mathbb{Z}[x]/(\phi(x)) \cong R \cong \mathbb{Z}^{n} \cong M^*_{\mathbb{Z}}. \tag{36}$$




For any polynomial $\alpha(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{n-1}x^{n-1} \in \mathbb{Z}[x]/(\phi(x))$, the 
corresponding algebraic integer is $\alpha = \alpha_0 + \alpha_1\theta + \cdots + \alpha_{n-1}\theta^{n-1} \in R$; we write this 
isomorphism as 

$$\alpha(x) \to \alpha \to \overline{\alpha} \to H^*(\overline{\alpha}). \tag{37}$$

A $\phi$-ideal lattice is an integer lattice which corresponds to an ideal of 
$\mathbb{Z}[x]/(\phi(x))$. It was first introduced by Lyubashevsky and Micciancio (2006) (see also 
Zheng et al. (2023)) and plays a key role in Gentry's construction of a fully 
homomorphic cryptosystem (see Gentry (2009)); Fluckiger and Suarez (2006) 
extended this definition to totally real number fields. 


Lemma 8 Let $E$ be an algebraic number field with the NC-property and $R = \mathbb{Z}[\theta]$ be the 
ring of algebraic integers of $E$. Then there is a one-to-one correspondence between the 
ideals of $R$ and the $\phi$-ideal lattices. Moreover, if $\alpha \in R$, then we have 

$$\tau(\alpha R) = L(H^*(\overline{\alpha})). \tag{38}$$

In general, suppose that $A \subset R$ is an ideal and $A \neq 0$; then there exist two elements 
$\alpha$ and $\beta$ in $A$ such that 

$$\tau(A) = L(H^*(\overline{\alpha})) + L(H^*(\overline{\beta})). \tag{39}$$


Proof Since there is a one-to-one correspondence between the $\phi$-ideal lattices and 
the ideals of $\mathbb{Z}[x]/(\phi(x))$ (see the Corollary of Zheng et al. (2023)), by (36), the first 
assertion follows immediately. Let $\alpha \in R$; then $\alpha R = \{\alpha x \mid x \in R\}$, and by Lemma 7 
we have 

$$\tau(\alpha x) = H^*(\overline{\alpha})\overline{x}, \quad \text{where } \overline{x} = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} \in \mathbb{Z}^{n}.$$

It follows that 

$$\tau(\alpha R) = \{ H^*(\overline{\alpha})\overline{x} \mid \overline{x} \in \mathbb{Z}^{n} \} = L(H^*(\overline{\alpha})).$$

To prove (39), recall that any ideal of $R$ is generated by at most two elements 
(see Corollary 5 of page 11 of Narkiewicz (2004)), namely, $A = \alpha R + \beta R$; then we 
have 

$$\tau(A) = \tau(\alpha R) + \tau(\beta R) = L(H^*(\overline{\alpha})) + L(H^*(\overline{\beta})).$$

To introduce an attainable algorithm for high dimensional RSA, we require some 
basic results from lattice theory. Let $L = L(B) \subset \mathbb{R}^{n}$ be a full-rank lattice; the 
determinant of $L$ is defined by 

$$d(L) = |\det(B)|. \tag{40}$$
Suppose that $B = [b_1, b_2, \ldots, b_n]$ is the generator matrix, where the $b_j \in \mathbb{R}^{n}$ are the column 
vectors of $B$. Since $\{b_1, b_2, \ldots, b_n\}$ is a basis for $\mathbb{R}^{n}$, let $B^* = [b_1^*, b_2^*, \ldots, b_n^*]$ 
be the corresponding orthogonal basis, where $b_1^* = b_1$, and the $b_i^*$ are obtained in order by the 
Gram-Schmidt orthogonalization process. 

A basis $B$ is said to be in Hermite Normal Form (HNF) if it is upper triangular, all 
elements on the diagonal are strictly positive, and any other element $b_{ij}$ satisfies 
$0 \leq b_{ij} < b_{ii}$. It is easy to see that every integer lattice $L = L(B)$ has a unique basis in 
Hermite Normal Form, denoted by HNF($L$) (see Theorem 2.4.3 of Cohen (1993)). 
Moreover, given any basis $B$ for the lattice $L$, HNF($L$) can be efficiently computed from 
$B$ (see Cohen (1993), Micciancio (2001)). 


Proposition 1 Let $L = L(B)$ and let $B = (b_{ij})_{n \times n}$ be the basis in HNF. Then the 
corresponding orthogonal basis $B^*$ is a diagonal matrix, namely 

$$B^* = \mathrm{diag}\{b_{11}, b_{22}, \ldots, b_{nn}\}. \tag{41}$$

Moreover, we have 

$$d(L) = \prod_{i=1}^{n} b_{ii}. \tag{42}$$

Proof See Micciancio (2001). 


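Definition 3 Let $L = L(B) \subset \mathbb{R}^{n}$ be a full-rank lattice, and let $B^* = [b_1^*, b_2^*, \ldots, b_n^*]$ 
be the corresponding orthogonal basis. The orthogonal parallelepiped $F(B^*)$ is 
defined by 

$$F(B^*) = \left\{ \sum_{i=1}^{n} x_i b_i^* \;\middle|\; 0 \leq x_i < 1 \text{ and } x_i \in \mathbb{R} \right\}. \tag{43}$$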


Proposition 2 Let $L = L(B) \subset \mathbb{Z}^{n}$ be an integer lattice, $B = \mathrm{HNF}(L)$ be the basis 
in HNF, $B^* = \mathrm{diag}\{b_{11}, b_{22}, \ldots, b_{nn}\}$ be the corresponding orthogonal basis, and 
$F(B^*)$ be the orthogonal parallelepiped given by (43). Then $S$ is a set of coset 
representatives for the quotient group $\mathbb{Z}^{n}/L$, where 

$$S = F(B^*) \cap \mathbb{Z}^{n} = \{ x = (x_1, x_2, \ldots, x_n) \mid x_i \in \mathbb{Z} \text{ and } 0 \leq x_i < b_{ii} \text{ for all } i \}.$$

Proof See Sect. 4.1 of Micciancio (2001). 
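
Proposition 2 also suggests how to compute the representative of a given integer vector: subtract suitable multiples of the HNF basis columns, starting from the last coordinate. The following sketch assumes that the HNF basis is already available as an upper triangular integer matrix stored as a list of rows (computing the HNF itself is not shown); the function name is an illustrative choice.

```python
def reduce_mod_hnf(B, x):
    """Reduce the integer vector x modulo the lattice L(B).

    B is assumed to be an upper triangular HNF basis whose columns generate
    the lattice. The result r satisfies x - r in L(B) and 0 <= r[i] < B[i][i].
    """
    n = len(B)
    r = list(x)
    for i in range(n - 1, -1, -1):
        q = r[i] // B[i][i]  # floor division, also correct for negative entries
        # subtract q times column i; it only touches coordinates 0..i,
        # so the coordinates already reduced stay unchanged
        for row in range(i + 1):
            r[row] -= q * B[row][i]
    return r
```

For the basis with columns $(3, 0)'$ and $(1, 2)'$, the vector $(7, 5)'$ reduces to $(2, 1)'$, and every integer vector reduces to one of the $3 \cdot 2 = 6$ points of $S$.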


Now, we return to the algebraic number field $E = \mathbb{Q}(\theta)$ (with the NC-property). Let 
$\alpha, \beta \in R$ be two algebraic integers. By Lemma 8, the principal ideal $\alpha R$ corresponds 
to the $\phi$-ideal lattice $L(H^*(\overline{\alpha}))$. Thus $A = (\alpha R)(\beta R) = \alpha\beta R$ corresponds 
to $L(H^*(\overline{\alpha} \otimes \overline{\beta}))$. 


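Definition 4 For given $\alpha, \beta \in R$, $\tau(\alpha) = \overline{\alpha}$, and $\tau(\beta) = \overline{\beta}$, we denote the lattice 
$L_{\alpha,\beta}$ by 

$$L_{\alpha,\beta} = L(H^*(\overline{\alpha} \otimes \overline{\beta})). \tag{44}$$

The HNF basis of $L_{\alpha,\beta}$ is denoted by $B_{\alpha,\beta}$, and the corresponding orthogonal basis 
is denoted by 

$$B^*_{\alpha,\beta} = \mathrm{diag}\{b_1, b_2, \ldots, b_n\}, \tag{45}$$

where $b_i \in \mathbb{Z}$ and $b_i \geq 1$. The set of integer points of the parallelepiped is given by 

$$S_{\alpha,\beta} = \{ (x_1, x_2, \ldots, x_n) \in \mathbb{Z}^{n} \mid x_i \in \mathbb{Z} \text{ and } 0 \leq x_i < b_i \}. \tag{46}$$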

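Lemma 9 Let $\alpha \in R$, $\beta \in R$, and $A = \alpha\beta R$. Then $S_{\alpha,\beta}$ given by (46) corresponds 
to a set of coset representatives of the factor ring $R/A$ in the algebraic number 
field $E$ with the NC-property. 


Proof By Proposition 1, it is easy to see that 

$$|S_{\alpha,\beta}| = \prod_{i=1}^{n} b_i = |\det(H^*(\overline{\alpha} \otimes \overline{\beta}))| = |\det(H^*(\overline{\alpha}))| \cdot |\det(H^*(\overline{\beta}))| = d(L_{\alpha,\beta}).$$

By Theorem 2 and (12), we have 

$$N(A) = |N(\alpha\beta)| = |N(\alpha)| \cdot |N(\beta)| = |\det(H^*(\overline{\alpha}))| \cdot |\det(H^*(\overline{\beta}))| = d(L_{\alpha,\beta}).$$

It follows that $N(A) = |S_{\alpha,\beta}|$. Since $E$ satisfies the NC-property, if $a \in R$, then $\overline{a} = 
\tau(a) \in \mathbb{Z}^{n}$; hence, for $a, b \in R$, $a \equiv b \pmod{A}$ in $R$ if and only if 

$$\overline{a} \equiv \overline{b} \pmod{L_{\alpha,\beta}}.$$

The lemma follows from Proposition 2 immediately. 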


The main result of this subsection is the following theorem. 


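Theorem 3 Let $E$ be an algebraic number field of degree $n$ with the NC-property, 
$\alpha \in R$, $\beta \in R$ be two distinct prime elements, $A = \alpha\beta R$, and $L_{\alpha,\beta}$ be the lattice 
given by (44). Then for any $\overline{a} \in \mathbb{Z}^{n}$, $k \in \mathbb{Z}$, $k \geq 0$, we have 

$$\overline{a}^{\otimes (k\varphi(\alpha,\beta)+1)} \equiv \overline{a} \pmod{L_{\alpha,\beta}}, \tag{47}$$

where 

$$\varphi(\alpha,\beta) = \left(|\det(H^*(\overline{\alpha}))| - 1\right)\left(|\det(H^*(\overline{\beta}))| - 1\right). \tag{48}$$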
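Proof Since $E$ satisfies the NC-property and $\overline{a} \in \mathbb{Z}^{n}$, we have $a = \tau^{-1}(\overline{a}) \in R$. By Theorem 1, 
we have 

$$a^{k\varphi(A)+1} \equiv a \pmod{A}.$$

It is easy to see that 

$$\begin{aligned} \varphi(A) = \varphi(\alpha R)\varphi(\beta R) &= (N(\alpha R) - 1)(N(\beta R) - 1) \\ &= (|N(\alpha)| - 1)(|N(\beta)| - 1) \\ &= \left(|\det(H^*(\overline{\alpha}))| - 1\right)\left(|\det(H^*(\overline{\beta}))| - 1\right) \\ &= \varphi(\alpha,\beta). \end{aligned}$$

By Lemma 8, we have 

$$\tau(A) = \tau(\alpha\beta R) = L(H^*(\overline{\alpha} \otimes \overline{\beta})) = L_{\alpha,\beta} \quad \text{and} \quad \tau\left(a^{k\varphi(A)+1}\right) = \overline{a}^{\otimes (k\varphi(\alpha,\beta)+1)}.$$

Therefore, (47) follows immediately. 


According to the above theorem, we may describe an attainable algorithm for 
high dimensional RSA as follows (Table 3). 


Table 3 Algorithm I 


Algorithm 9.1: RSA in the Algebraic Numbers Field 


• Parameters: $n \geq 1$ is a positive integer, $E/\mathbb{Q}$ is an algebraic number field with the NC-property of 
degree $n$, $R \subset E$ is the ring of algebraic integers of $E$, $\alpha \in R$, $\beta \in R$ are two distinct 
prime elements of $R$, $A = \alpha\beta R$ is a principal ideal of $R$, $H^*(\overline{\alpha} \otimes \overline{\beta})$ is the ideal 
matrix corresponding to $A$, $L_{\alpha,\beta} = L(H^*(\overline{\alpha} \otimes \overline{\beta}))$ is the lattice generated by 
$H^*(\overline{\alpha} \otimes \overline{\beta})$, $B_{\alpha,\beta} = \mathrm{HNF}(L_{\alpha,\beta})$ is the basis of $L_{\alpha,\beta}$ in HNF, 
$B^*_{\alpha,\beta} = \mathrm{diag}\{b_1, b_2, \ldots, b_n\}$ is the corresponding orthogonal basis, 
$\varphi(\alpha,\beta) = \left(|\det(H^*(\overline{\alpha}))| - 1\right)\left(|\det(H^*(\overline{\beta}))| - 1\right)$, 
$S_{\alpha,\beta} = \{ x = (x_1, x_2, \ldots, x_n) \in \mathbb{Z}^{n} \mid 0 \leq x_i < b_i \}$, $1 \leq e < \varphi(\alpha,\beta)$, 
$1 \leq d < \varphi(\alpha,\beta)$, such that $ed \equiv 1 \pmod{\varphi(\alpha,\beta)}$ 

• Public keys: The rotation matrix $H$, the lattice $L(B_{\alpha,\beta}) = L_{\alpha,\beta}$ and the 
positive integer $e$ are public keys 

• Private keys: The ideal matrices $H^*(\overline{\alpha})$, $H^*(\overline{\beta})$, the basis $H^*(\overline{\alpha} \otimes \overline{\beta})$ of $L_{\alpha,\beta}$ 
and the positive integer $d$ are private keys 

• Encryption: For any input message $\overline{a} \in S_{\alpha,\beta}$, the ciphertext $\overline{c}$ is given by 
$\overline{c} \equiv \overline{a}^{\otimes e} \pmod{L_{\alpha,\beta}}$ 

• Decryption: $\overline{c}^{\otimes d} \equiv \overline{a}^{\otimes ed} \equiv \overline{a}^{\otimes (k\varphi(\alpha,\beta)+1)} \equiv \overline{a} \pmod{L_{\alpha,\beta}}$. One can find the plaintext 
$\overline{a}$ from $\overline{c}$ in $S_{\alpha,\beta}$ 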


Remark 2 If the class number $h_E = 1$, in other words, $R$ is a UFD, then the prime 
elements of $R$ coincide with the irreducible elements of $R$, and one can find prime elements 
$\alpha$ from $\alpha(x) \in \mathbb{Z}[x]/(\phi(x))$ with $\alpha(x)$ irreducible. 


4 Security and Example 


The classical RSA public key cryptosystem is nowadays used in a wide variety of 
applications ranging from web browsers to smart cards. Since its initial publication 
in 1978, many researchers have tried to look for vulnerabilities in the system. Some 
clever attacks have been found (see Bonech (2002), Coppersmith (2001)). However, 
none of the known attacks is devastating and the ordinary RSA system is still 
considered secure. 

The security of high dimensional RSA depends essentially on the difficulty of factoring an element 
of the ring of algebraic integers $R$ into a product of distinct prime elements. Factoring 
in $R$ is much more complicated than factoring a positive integer, and no 
efficient method is known to date; we therefore consider the high dimensional RSA 
to be highly secure. 

As for the size of the private keys, since $\det(H^*(\overline{\alpha})) = N(\alpha)$, it may be extremely 
large. For example, if $\alpha = p \in \mathbb{Z}$ and $\beta = q \in \mathbb{Z}$ are prime numbers, then 

$$\det(H^*(\overline{\alpha})) = N(\alpha) = p^{n}, \quad \det(H^*(\overline{\beta})) = q^{n}$$

and 

$$\varphi(\alpha,\beta) = (p^{n} - 1)(q^{n} - 1),$$

which is much larger than $pq$, the latter being the size of the public key of the classical RSA 
cryptosystem. 

Lattice-based cryptography has been intensively studied for the past two 
decades. The GGH cryptosystem proposed by Goldreich et al. (1997) is perhaps 
the most intuitive encryption scheme based on lattices. The public key is a "bad" 
basis for a lattice, and Micciancio (2001) proposed to use, as the public basis, the 
Hermite Normal Form $B = \mathrm{HNF}(L)$. The private key of GGH is an exceptionally 
good basis for $L$. The security of GGH relies on the assumption that it is difficult to 
find a special basis for $L$ from a known basis of $L$. In this sense, we regard the high 
dimensional RSA as at least as secure as the GGH/HNF cryptosystem. 

Another number theoretic cryptosystem based on lattices is NTRUEncrypt. 
The public key cryptosystem NTRU, proposed in 1996 by Hoffstein et al. (1998), 
is the fastest known lattice-based encryption scheme. Although its description relies 
on arithmetic over the polynomial quotient ring $\mathbb{Z}[x]/(x^{n} - 1)$, it was easily observed 
that it can be expressed as a lattice-based cryptosystem. NTRU uses a $q$-ary 
convolutional modular lattice (see Micciancio and Regev (2009), Zheng (2022)); its 
public key is also the HNF basis of $L$, and the private key is a special basis of $L$ 
containing two secret polynomials $f(x)$ and $g(x)$. In this sense, we regard breaking our 
Algorithm I as at least as hard as breaking NTRUEncrypt. 

Unfortunately, neither GGH nor NTRU is supported by a proof of security showing 
that breaking the cryptosystem is at least as hard as solving some underlying 
lattice problem; they are primarily practical proposals aimed at offering a concrete 
alternative to RSA or other number theoretic cryptosystems (see page 166 of 
Micciancio and Regev (2009)). However, the significance of this chapter is to show that 
the real alternative to RSA is the high dimensional RSA we present here rather than 
GGH and NTRU. 


Example 1 Finally, we give an example to show how the high dimensional 
RSA works in a quadratic field. Let $E = \mathbb{Q}(\sqrt{d})$, where $d \in \mathbb{Z}$ is a square-free integer and $d \equiv 2$ 
or $3 \pmod{4}$; thus $E$ satisfies the NC-property. Let $\delta_E$ be the discriminant of $E$; 
it is known that $\delta_E = 4d$ (see Proposition 13.1.2 of Ireland and Rosen (1990)). Let 
$p \in \mathbb{Z}$ be an odd prime satisfying the following condition: 

$$p \nmid 4d, \quad \text{and} \quad x^{2} \equiv d \pmod{p} \text{ is not solvable in } \mathbb{Z}. \tag{49}$$

By Proposition 13.1.3 of Ireland and Rosen (1990), we know that $p$ is a prime element 
in $R$. 
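
Condition (49) is easy to test with Euler's criterion, which states that for an odd prime $p \nmid d$ the congruence $x^{2} \equiv d \pmod{p}$ is unsolvable exactly when $d^{(p-1)/2} \equiv -1 \pmod{p}$. The sketch below is illustrative; the function name is hypothetical and the primality of `p` is assumed rather than checked.

```python
def satisfies_condition_49(p: int, d: int) -> bool:
    """Return True if the odd prime p satisfies (49) for the given d.

    The caller is assumed to pass a prime p; primality is not verified here.
    """
    if p == 2 or (4 * d) % p == 0:
        return False
    # Euler's criterion: d is a quadratic non-residue mod p
    # if and only if d^((p-1)/2) is congruent to -1 mod p
    return pow(d % p, (p - 1) // 2, p) == p - 1
```

For $d = 2$ the primes below $24$ that pass the test are $3, 5, 11, 13, 19$, that is, the primes congruent to $\pm 3$ modulo $8$.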

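According to Algorithm I, we select two large primes $p$ and $q$ satisfying 
(49). Let $\alpha = p$ and $\beta = q$; then 

$$\overline{\alpha} = \begin{pmatrix} p \\ 0 \end{pmatrix}, \quad \overline{\beta} = \begin{pmatrix} q \\ 0 \end{pmatrix}, \quad H^*(\overline{\alpha}) = \begin{pmatrix} p & 0 \\ 0 & p \end{pmatrix}, \quad H^*(\overline{\beta}) = \begin{pmatrix} q & 0 \\ 0 & q \end{pmatrix}.$$

It follows that 

$$H^*(\overline{\alpha} \otimes \overline{\beta}) = H^*(\overline{\alpha})H^*(\overline{\beta}) = \begin{pmatrix} pq & 0 \\ 0 & pq \end{pmatrix}, \quad L_{\alpha,\beta} = L(H^*(\overline{\alpha} \otimes \overline{\beta})), \tag{50}$$

$$S_{\alpha,\beta} = \left\{ x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{Z}^{2} \;\middle|\; 0 \leq x_1, x_2 < pq \right\}. \tag{51}$$

It is easy to see that 

$$\varphi(\alpha,\beta) = (p^{2} - 1)(q^{2} - 1). \tag{52}$$

In this special case, the two-dimensional RSA may be described as follows 
(Table 4). 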

We can deal similarly with the case of cyclotomic fields. Let $n = \varphi(m)$ for some 
positive integer $m$, $\xi_m = e^{2\pi i/m}$, $E = \mathbb{Q}(\xi_m)$, and $R \subset E$ be the ring of algebraic 
integers of $E$. Suppose that $p \in \mathbb{Z}$ is a rational prime number; then $p$ is a prime 
element of $R$ if and only if (see Theorem 2 of page 196 of Ireland and Rosen (1990)) 

$$p \nmid m \ \text{ and } \ n = \varphi(m) \text{ is the smallest positive integer } f \text{ with } p^{f} \equiv 1 \pmod{m}. \tag{53}$$

Suppose that $p \in \mathbb{Z}$ and $q \in \mathbb{Z}$ are two distinct prime numbers satisfying (53); we then 
obtain the lattice $L(H^*(\overline{p} \otimes \overline{q}))$ and an attainable algorithm in $\mathbb{Q}(\xi_m)$. 
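Table 4 RSA in a quadratic field 


RSA in a Quadratic Field 


• Parameters: $E = \mathbb{Q}(\sqrt{d})$, $d$ is a square-free integer and $d \equiv 2$ or $3 \pmod{4}$, 
the rotation matrix $H = \begin{pmatrix} 0 & d \\ 1 & 0 \end{pmatrix}$, $p$, $q$ are two large and distinct 
prime numbers which satisfy (49). $N = pq$ and $\chi(N) = (p^{2} - 1)(q^{2} - 1)$, 
$L = L(B)$ is the lattice with basis $B = \begin{pmatrix} N & 0 \\ 0 & N \end{pmatrix}$, 
$\mathbb{Z}_{pq}^{2} = \{ (x_1, x_2)' \in \mathbb{Z}^{2} \mid 0 \leq x_1, x_2 < pq \}$, $1 \leq e < \chi(N)$, $1 \leq d_1 < \chi(N)$ 
such that $ed_1 \equiv 1 \pmod{\chi(N)}$ 

• Public keys: $H$, $N$ and the positive integer $e$ are public keys 

• Private keys: $p$, $q$ and the positive integer $d_1$ are private keys 

• Encryption: For any $\overline{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \in \mathbb{Z}_{pq}^{2}$, the ciphertext $\overline{c} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \in \mathbb{Z}_{pq}^{2}$ is 
given by $\overline{c} \equiv \overline{a}^{\otimes e} \pmod{L}$ 

• Decryption: $\overline{c}^{\otimes d_1} \equiv \overline{a}^{\otimes ed_1} \equiv \overline{a} \pmod{L}$. One can find the plaintext $\overline{a}$ from $\overline{c}$ in $\mathbb{Z}_{pq}^{2}$ 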
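
The two-dimensional scheme of Table 4 can be tried out with toy parameters. The following is a minimal sketch under the assumptions of Example 1 ($\alpha = p$, $\beta = q$, so that reduction modulo $L$ is coordinate-wise reduction modulo $N = pq$); the function names and the small primes are illustrative and far too small for real use.

```python
from math import gcd


def multiply(a, b, d, N):
    """The product H*(a) b for phi(x) = x^2 - d, with coordinates reduced mod N.

    It corresponds to (a1 + a2*sqrt(d)) * (b1 + b2*sqrt(d)).
    """
    return ((a[0] * b[0] + d * a[1] * b[1]) % N, (a[0] * b[1] + a[1] * b[0]) % N)


def power(a, exponent, d, N):
    """Repeated product of a with itself, by square-and-multiply."""
    result = (1 % N, 0)  # the vector e_1, i.e. the unit element
    base = (a[0] % N, a[1] % N)
    while exponent > 0:
        if exponent & 1:
            result = multiply(result, base, d, N)
        base = multiply(base, base, d, N)
        exponent >>= 1
    return result


def keygen(p, q, e):
    """Return (N, chi, d1) with e*d1 = 1 mod chi, where chi = (p^2 - 1)(q^2 - 1)."""
    N = p * q
    chi = (p * p - 1) * (q * q - 1)
    if gcd(e, chi) != 1:
        raise ValueError("e must be coprime to (p^2 - 1)(q^2 - 1)")
    return N, chi, pow(e, -1, chi)  # modular inverse, Python >= 3.8
```

For instance, with $d = 2$, $p = 11$, $q = 13$ (both satisfy (49)) and $e = 11$, one obtains $N = 143$ and $\chi(N) = 20160$; encrypting a message with `power(a, e, 2, N)` and decrypting with `power(c, d1, 2, N)` returns the original pair.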
Central Bank Digital Currency 
Cross-Border Payment Model Based 
on Blockchain Technology 

Mao Hanyu 


Abstract Since the turn of the twenty-first century, the growth of the globalized economy and trade has accelerated, and the cross-border payment system, an essential component of the international financial infrastructure, has played a significant role in both. However, traditional cross-border payments present risks and challenges, such as expensive processing fees, limited payment efficiency, information asymmetry in the trade process, and reliance on a highly centralized cross-border payment system. Building on consortium blockchain technology and using Polkadot's Parachain, Relay chain, and cross-chain technologies as references, this chapter designs a scalable, high-efficiency, high-security, and privacy-protecting central bank digital currency cross-border payment model. It analyzes the use of hash digest technology and CoinJoin technology to prevent the tracing of transactions and thereby protect privacy. The issuance of a multi-country central bank digital currency or a stablecoin anchored to a basket of fiat currencies is discussed as the currency in circulation in the model. Finally, the development trend of central bank digital currency cross-border payments is summarized and forecast.


Keywords Payment model · Cross-border · Blockchain technology · CBDC


1 Introduction 


Since 2020, global digital transformation has developed rapidly, and the era of central bank digital currency (CBDC) is arriving quickly, with China's digital currency, the e-CNY, leading the world. The launch of the e-CNY not only promotes the healthy development of China's digital economy but also speeds up RMB internationalization. At the same time, CBDC has especially significant advantages in cross-border payments: it can effectively address the long settlement times, high cost, low efficiency, and low transparency of current cross-border payments. In addition, building a cross-border payment network based on CBDC offers an opportunity to break the monopoly position of the US dollar and reshape the global cross-border payment system. Therefore, CBDC cross-border payments will inject new vitality into the rapid growth of China's economy and will also play a pivotal role in establishing a fair and equitable international monetary settlement system.

With the rapid development of CBDC, cross-border payments are becoming a hotspot in central bank digital currency research. According to a survey by the BIS, more than 50% of central banks consider cross-border payments one of the crucial reasons for accelerating the development of CBDC. Traditional cross-border payments suffer from high fees, low efficiency, information asymmetry in the cross-border trade process, and the potential financial risk of a highly centralized cross-border payment system. A CBDC cross-border payment system, with its high payment efficiency, low cost, and high transparency, is conducive not only to easing existing cross-border trade friction and breaking up the centralized cross-border payment system, but also to curbing competitive currency devaluation, currency wars, and other harmful behavior between countries, promoting the peaceful development of financial markets, and laying a healthy market foundation for international trade through a moderately centralized cross-border payment system (Yang, 2020). Therefore, a large number of central banks and international organizations have started to explore the application of CBDC in cross-border payments. On February 26, 2022, the United States, together with the European Union, the United Kingdom, and Canada, issued a joint statement announcing that selected Russian banks would be removed from the Society for Worldwide Interbank Financial Telecommunication (SWIFT) messaging system. This undoubtedly accelerated countries' research into bypassing SWIFT for cross-border transactions.

Currently, the research on cross-border payment of central bank digital currency 
is still in the initial stage. There is a lack of in-depth research for a scalable and 
high-efficiency cross-border payment model, which leads to a lack of necessary 
theoretical research and essential technical support for its development. Therefore, 
it is significant to design a scalable, high-efficiency, CBDC cross-border payment 
model based on blockchain. 

Cross-border payment using CBDCs issued by central banks has become a significant trend. In this chapter, we use Polkadot's Parachain, Relay chain, and cross-chain technologies as references, and we aim to design a scalable, high-efficiency, highly secure, and privacy-preserving CBDC cross-border payment model based on a consortium blockchain.
2 CBDC Cross-Border Payment Development Current Situation


CBDC cross-border payments can be made in two ways: first, retail central bank digital currencies (CBDCs) in a given jurisdiction are available to people both inside and outside the jurisdiction, with no coordination between central banks; and second, central banks work together to establish access and settlement arrangements between different retail or wholesale CBDCs (Wan & Wu, 2022). CBDC cross-border payments can be divided into four quadrants: “same system and same currency”, “same system and different currency”, “same currency and different system”, and “different currency and different system”. Among them, “same system and different currency” and “different currency and different system” are the most typical scenarios for cross-border payments and will be a key research focus in the future.

At this stage, research on CBDC cross-border payments uses three models to achieve cross-border and cross-currency interoperability: enhancing the compatibility of CBDC systems; linking multiple CBDC systems; and integrating multiple CBDCs in a single multi-CBDC (mCBDC) system (Auer et al., 2021). Projects that link multiple CBDC systems include the Stella project of the European Central Bank and the Bank of Japan (2019); the Jasper-Ubin project of the Bank of Canada (BOC) and the Monetary Authority of Singapore (MAS) (2019); and the Jura project for cross-border payment between the Banque de France and the Swiss National Bank. Projects that integrate multiple CBDCs in a single mCBDC system mainly include the Aber project of the central banks of the UAE and Saudi Arabia (2020); Dunbar, a joint project of the Monetary Authority of Singapore and the BIS (2022); and the Inthanon-LionRock project of the Bank of Thailand and the Hong Kong Monetary Authority (2020). In 2021, with the addition of the Digital Currency Institute of the People's Bank of China and the Central Bank of the United Arab Emirates, the Inthanon-LionRock project entered its third phase and was renamed the mCBDC Bridge (mBridge) Project (Inthanon-LionRock to mBridge: Building a multi CBDC platform for international payments, 2021).

Recently, the CBDC projects Jasper, Ubin, and Stella have completed their experiments. All of these projects follow the path “from wholesale payments to voucher payments to cross-border payments” (Yao, 2021). Thus, enabling cross-border payments is the ultimate goal and an essential part of the CBDC research route. Moreover, the experimental results of these representative wholesale CBDC projects show that current technology and design solutions can support Real-Time Gross Settlement (RTGS) in terms of efficiency and can also realize a Liquidity Saving Mechanism (LSM) in terms of functionality (Huang, 2022). These projects also show that cross-chain technology is a crucial issue for CBDC cross-border payments. Although CBDC cross-border payments have become a research hotspot both at home and abroad, most existing studies approach them from the perspectives of economics and law, and the proposed CBDC cross-border payment models have not been sufficiently investigated at the technical level.
At the technical level, most research by central banks or international organizations has focused on linking multiple CBDC systems and integrating multiple CBDCs in a single mCBDC system. However, most projects are still at the experimental stage with the participation of only a few countries, and they lack scalability in practice. At the same time, the cross-chain technology used to link multiple CBDC systems, the hash time-locked contract (HTLC), is limited in its application scenarios: each pair of transacting parties needs its own transaction channel, so N participants require on the order of N² channels, and the number of channels grows quadratically as N increases. Therefore, the scalability of HTLC is deficient, and it may not be suitable for large-scale economies. While integrating multiple CBDCs in a single mCBDC system can avoid complex HTLCs and improve payment efficiency, the establishment of privacy groups inevitably introduces multi-ledger-style behaviors and constraints that hinder the realization of transaction atomicity. Therefore, research on cross-chain technology and on effective privacy protection mechanisms that achieve transaction atomicity and improve transaction efficiency while ensuring transaction privacy is the focus of future research.
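
To make the scalability argument concrete, the following minimal sketch counts the channels needed when every pair of CBDC systems is linked directly and models a single HTLC in memory. It is an illustration under stated assumptions, not the implementation used by any of the projects above: the class `HTLC`, the function `pairwise_channels`, the use of SHA-256 as the hash lock, and the integer logical clock are all hypothetical choices.

```python
import hashlib


def pairwise_channels(n):
    """Channels needed if each pair of n CBDC systems opens its own channel."""
    return n * (n - 1) // 2


class HTLC:
    """Hypothetical in-memory hash time-locked contract."""

    def __init__(self, sender, receiver, amount, hashlock, timeout):
        self.sender = sender
        self.receiver = receiver
        self.amount = amount
        self.hashlock = hashlock   # SHA-256 digest of a secret preimage
        self.timeout = timeout     # logical time from which a refund is allowed
        self.settled = False

    def claim(self, preimage, now):
        """Receiver gets the funds by revealing the preimage before the timeout."""
        if self.settled or now >= self.timeout:
            return None
        if hashlib.sha256(preimage).digest() != self.hashlock:
            return None
        self.settled = True
        return (self.receiver, self.amount)

    def refund(self, now):
        """Sender recovers the funds once the timeout has been reached."""
        if self.settled or now < self.timeout:
            return None
        self.settled = True
        return (self.sender, self.amount)
```

For example, `pairwise_channels(50)` gives 1225 direct channels, which is the quadratic growth described above.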


3 Polkadot Technology Overview 


Polkadot is a scalable heterogeneous multi-chain technology that provides a more general cross-chain protocol. Any blockchain system compatible with Polkadot's cross-chain protocol can complete cross-chain interconnection (Polkadot, 2016). Polkadot is envisioned as a new form of blockchain, a “blockchain network”, and as one of the critical infrastructures of the future Web 3.0. As shown in Fig. 1, Polkadot is composed of Parachains, a Relay chain, and bridges. It uses different Parachains to satisfy the needs of different applications, and it uses the Relay chain to unify the management of consensus security and data interaction, which can address the scalability and isolability problems of current blockchain technology.


3.1 Relay Chain and Parachain Technology 


The Parachain is a member blockchain of Polkadot that collects and processes trans- 
actions and transmits them to a Relay chain. Each participating Parachain has a high 
degree of autonomy and flexibility. Each Parachain can be designed and focused on a 
specific scenario as long as it follows the protocols set by Polkadot. The Relay chain 
is the core of the Polkadot network, responsible for maintaining the whole network's 
security, coordinating consensus among different Parachains, and forwarding cross- 
chain transactions between each Parachain. The consensus mechanism of the Relay 
chain uses an asynchronous Byzantine fault-tolerant algorithm to reach consensus. 
Fig. 1 Polkadot architecture schematic
In order to maintain the Relay chain, the Polkadot network establishes four roles: Nominator, Validator, Collator, and Fisherman. Nominators are a group of token (DOT) holders who have the authority to vote for Validators. Validator nodes have the highest authority in the network and can create blocks for the whole network. They are elected by Nominator vote and can validate and pack blocks after staking a sufficient token deposit in the system. If Validators perform their duties, they are rewarded for generating blocks. If they do not, they are punished by having some or all of their deposit deducted. Collators are a group of nodes that collect information from a Parachain and package it for submission to the Validators. They submit candidate blocks to the Validators and assist them in creating valid blocks, for which they are rewarded with a fee. Collators therefore collect as much information as possible in order to earn more fees. A Fisherman is a relatively independent node in the system. It is only responsible for monitoring the system for illegal activities and reporting those it detects, for which it receives a substantial one-time reward. Moreover, a deposit is required to become a Fisherman, mainly to prevent Sybil attacks that waste the Validators' computing time and resources.
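
The incentive logic for Validators described above (election by vote, a required deposit, rewards for honest work, and deduction of the deposit otherwise) can be sketched as follows. This is a deliberately simplified illustration, not Polkadot's actual election or slashing rules: the names `Validator`, `elect`, and `settle`, the minimum-deposit rule, and the choice to credit rewards to the deposit are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Validator:
    name: str
    deposit: float        # staked tokens
    votes: float = 0.0    # support received from Nominators


def elect(candidates, seats, min_deposit):
    """Pick the `seats` most-voted candidates whose deposit is sufficient."""
    eligible = [v for v in candidates if v.deposit >= min_deposit]
    ranked = sorted(eligible, key=lambda v: (-v.votes, v.name))
    return ranked[:seats]


def settle(validator, behaved, reward, slash_fraction):
    """Reward a Validator that did its duty, otherwise deduct part of its deposit."""
    if behaved:
        validator.deposit += reward
    else:
        validator.deposit -= validator.deposit * slash_fraction
    return validator.deposit
```

Setting `slash_fraction` to 1.0 models the case in which the whole deposit is deducted.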


3.2 Polkadot Cross-Chain Technology 


Cross-chain communication is the most critical part of Polkadot, as shown in Fig. 2. Because the Relay chain guarantees security for the whole system, transactions conducted on one Parachain can be transferred to another Parachain through the Relay chain. As a result, cross-chain transactions on Polkadot are simpler and more efficient than other cross-chain methods. Specifically, each Parachain maintains an egress and an ingress transaction queue. The queues use Merkle trees to ensure data authenticity. When a Parachain (A) initiates a cross-chain transaction to another Parachain (B), the transaction is pushed to Parachain A's egress queue. Then the Relay chain transfers the transactions in Parachain A's egress queue to Parachain B's ingress queue, and Parachain B processes the transactions in its ingress queue itself (Yuan & Wang, 2019).

Fig. 2 Schematic diagram of cross-chain transactions on Polkadot. This figure is from the Polkadot White Paper (Polkadot, 2016)
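
The queue hand-off described in this subsection can be sketched in a few lines. The classes `Parachain` and `RelayChain` and their methods are hypothetical names for illustration; the sketch omits consensus and Merkle proofs and only shows how a message moves from the source egress queue to the destination ingress queue.

```python
from collections import deque


class Parachain:
    def __init__(self, name):
        self.name = name
        self.egress = deque()    # outgoing cross-chain transactions
        self.ingress = deque()   # incoming cross-chain transactions
        self.processed = []

    def send(self, dest, payload):
        """Push a cross-chain transaction onto this chain's egress queue."""
        self.egress.append({"src": self.name, "dest": dest, "payload": payload})

    def process_ingress(self):
        """The destination chain processes its own ingress queue."""
        while self.ingress:
            self.processed.append(self.ingress.popleft())


class RelayChain:
    def __init__(self, parachains):
        self.chains = {chain.name: chain for chain in parachains}

    def route(self):
        """Move queued messages from each egress queue to the target ingress queue."""
        moved = 0
        for chain in self.chains.values():
            while chain.egress:
                msg = chain.egress.popleft()
                self.chains[msg["dest"]].ingress.append(msg)
                moved += 1
        return moved
```

A transfer from A to B is then `a.send("B", ...)`, followed by `relay.route()` and `b.process_ingress()`.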


4 CBDC Cross-Border Payment Model 


This section will introduce a CBDC cross-border payment model based on con- 
sortium blockchain technology, referencing Polkadot's Parachain, Relay Chain, and 
cross-chain technologies. Figure 3 depicts a scalable, high-efficiency, high-security, 
and privacy-protecting CBDC cross-border payment model. 


Fig. 3 CBDC cross-border payments model schematic (labels recovered from the figure: Parachain, Validator swarm)
4.1 Design of Parachain


Every country is a Parachain in this model, and each Parachain is a consortium blockchain. A consortium blockchain is a permissioned blockchain, meaning that only a set of internally designated nodes can upload, record, and read data. These nodes act as bookkeeping nodes and collectively decide on block generation. Using a consortium blockchain can significantly improve the blockchain's operational efficiency and reduce network latency while ensuring the privacy of each transaction's data. Therefore, implementing each country's Parachain as a consortium blockchain serves both to improve efficiency and to protect privacy.

Since Parachain has a high degree of autonomy and flexibility, each country's 
Parachain can be designed independently according to its own country's conditions. 
Therefore, the consensus mechanism of each country's Parachain can be chosen 
according to the country's reality, such as the mainstream consensus mechanisms 
applied to the consortium blockchain: Raft, PBFT, etc. The Parachain of each country 
can be divided into several nodes with different authorities according to the actual 
situation of cross-border payment in the country, including the central bank, trusted 
financial payment institutions, and regulatory agencies. As a result, three roles are 
established in the network of this model: Validators, Collators, and Supervisors. 

The work of Collators is to collect information on the Parachain, submit candidate blocks to the Validators, and assist the group of Validators in creating valid blocks. They also have the authority to vote for the Validators. Consequently, the Collators can be commercial banks and trusted financial payment institutions in each country. Validators are the nodes with the authority to generate blocks and have the highest authority in the system. The Validator nodes are elected by vote of the Collators and are responsible for validating the blocks and packaging them. Each country's central bank or a specialized agency can fill this critical role. Supervisors are the nodes responsible for policing illegal activities in the system. This role can thus be held by each country's regulators.

Taking China as an example, the Collator nodes can be served by the six state-owned commercial banks, namely Industrial and Commercial Bank of China (ICBC), Agricultural Bank of China (ABC), Bank of China (BOC), China Construction Bank (CCB), Bank of Communications (BCM), and Postal Savings Bank of China (PSBC); trusted commercial banks and third-party payment institutions can be added in the future. The Validators are China's central bank, the People's Bank of China, and its affiliated institutions. The Supervisors are mainly the Ministry of Commerce, the China Banking and Insurance Regulatory Commission, the National Audit Office, and other regulatory authorities.
4.2 Design of Relay chain


In this model, the Relay chain is likewise a consortium blockchain. It contains the protocols of all Parachains, can recognize the transaction format of each country's Parachain, and is responsible for coordinating consensus and forwarding cross-chain transactions between different Parachains. The Validator nodes elected by the Collators in each country are also added to the Relay chain and are responsible for packaging transactions and generating blocks.

Specifically, the Collators in each country's Parachain first elect the Validators in charge of their Parachain, and the Validators are added to the Relay chain. After that, the Collators on each country's Parachain collect transactions into blocks, attach a non-interactive zero-knowledge proof showing that the parent block of each new block is valid, and hand the blocks over to the Validators in charge of their country's Parachain. The Validators from each country involved in a cross-border transaction form a Validator group that validates the blocks in the order in which the Collators send them and then reaches consensus on the Parachain blocks for that height. Once the Validators of each country involved in the cross-border payment confirm that their country's Parachain has confirmed the transaction, the Validator group routes the message to the Relay chain and generates the Relay chain block. In the next round, the Collators in each country's Parachain vote again to elect new Validators, and the cycle repeats.
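
One round of this workflow can be outlined as below. The sketch is only a schematic reading of the steps in this subsection: the function names `elect_validators` and `relay_block`, the one-vote-per-Collator tally, and the dictionary layout of a transaction are assumptions, and the non-interactive zero-knowledge proof is not modeled.

```python
def elect_validators(votes, seats=1):
    """Tally Collator votes (collator -> candidate) and return the winners."""
    tally = {}
    for candidate in votes.values():
        tally[candidate] = tally.get(candidate, 0) + 1
    ranked = sorted(tally, key=lambda c: (-tally[c], c))
    return ranked[:seats]


def relay_block(tx, confirmations):
    """Create a Relay chain block only when every involved country has confirmed.

    `confirmations` maps a country code to True once that country's Validators
    report that their Parachain confirmed the transaction. Countries that are
    not part of the transaction are ignored.
    """
    involved = {tx["from_country"], tx["to_country"]}
    if all(confirmations.get(country, False) for country in involved):
        return {"tx_id": tx["id"], "validator_group": sorted(involved)}
    return None
```

Only the two countries named in the transaction appear in `validator_group`, which mirrors the idea that the group is formed from the countries involved in that payment.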


4.3 Cross-Chain Transaction 


The cross-chain transactions of the model are approximately the same as those of Polkadot. Each country's Parachain contains an egress transaction queue and an ingress transaction queue (there can be multiple egress and ingress queues if the transaction volume is large). The Relay chain transfers transactions from the egress transaction queue of the source Parachain to the ingress transaction queue of the destination Parachain.

The egress transaction queue contains a list of bins keyed by routing information, each bin being a concatenated list of egress posts. Merkle tree proofs can be exchanged between the Validators of Parachains so that a block of one Parachain can be proven to carry a particular egress transaction queue destined for another Parachain, guaranteeing data authenticity. If the ingress transaction queue of a Parachain exceeds the block processing threshold, it is marked as full on the Relay chain, and no new messages are routed to it until the queue is emptied. The Merkle tree is also used to prove that the Collators' operations in the Parachain blocks are trustworthy.
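
The two mechanisms in this paragraph, a Merkle proof that a transaction belongs to a committed queue and a threshold that closes a full ingress queue, can be illustrated as follows. This is a generic binary Merkle tree over SHA-256 that duplicates the last node on odd levels; that construction, along with the names `merkle_root`, `merkle_proof`, `verify`, and `is_saturated`, is an assumption for the example and not the trie layout that Polkadot itself uses.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd-length level repeats its last node."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root hash of a non-empty list of byte-string leaves."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from the leaf at `index` up to the root."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling is left)
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


def is_saturated(ingress_length, threshold):
    """True when the ingress queue exceeds the block processing threshold."""
    return ingress_length > threshold
```

A Validator that holds only the root can check a single queued transaction with `verify` without receiving the whole queue.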

For example, the flow of a cross-border transaction between China and Russia is as follows. When the Chinese Parachain launches a cross-chain transaction to the Russian Parachain, the transaction is first pushed to the Chinese Parachain's egress transaction queue. The Relay chain then transfers the transaction from the Chinese egress transaction queue to the Russian ingress transaction queue, and the Russian Parachain processes the transaction in its ingress queue. This design can effectively guarantee the security of cross-chain transactions and significantly improve their efficiency.


4.4 Privacy Protection 


In the Parachain, the blockchain ledger takes advantage of the one-way (irreversible) nature of the hash algorithm and records hash digests in place of sensitive transaction information. At the same time, CoinJoin technology is used to obfuscate transactions and sever the relationship between the input and output addresses of transactions, so that the origin and destination of transactions cannot be traced and privacy is protected.
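
Both ideas can be shown with a small sketch. It is illustrative only: the function names `digest` and `coinjoin` are hypothetical, the salt is an added assumption (a plain hash of low-entropy data could otherwise be guessed by brute force), and the equal-amount requirement reflects the usual way CoinJoin hides which input paid which output rather than anything specified in this chapter.

```python
import hashlib
import random


def digest(record, salt):
    """Store this salted SHA-256 digest on the ledger instead of the record."""
    return hashlib.sha256(salt + record.encode("utf-8")).hexdigest()


def coinjoin(payments, rng):
    """Merge several equal-amount payments into one joint transaction.

    `payments` is a list of (input_address, output_address, amount).
    The outputs are shuffled so that their order no longer reveals
    which input funded which output.
    """
    amounts = {amount for _, _, amount in payments}
    if len(amounts) != 1:
        raise ValueError("all payments must use the same amount")
    inputs = [(src, amount) for src, _, amount in payments]
    outputs = [(dst, amount) for _, dst, amount in payments]
    rng.shuffle(outputs)
    return inputs, outputs
```

The joint transaction still balances (total inputs equal total outputs), so it can be validated without knowing the original pairing.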

In the Relay chain, a block can be generated once the countries involved in a cross-border transaction have confirmed it; countries that are not involved do not take part. The detailed information does not need to be authenticated by the nodes of other countries, which prevents other countries from learning the details of the transactions and effectively protects the privacy of each country's cross-border transaction information.


5 CBDC Cross-Border Payment Model Architecture 


The whole cross-border payment model is divided into four layers, as shown in Fig. 4: the application layer, the architecture layer, the blockchain layer, and the digital currency issuance layer.

Application layer: This layer mainly faces users and can provide user identity 
authentication services, system access services, etc. The authentication technology 
verifies the user's identity through the authentication center to ensure the validity of 
the trader's identity. Users can access the system if they pass authentication. 

Architecture layer: This layer is composed of Parachain, which is designed by each 
country, and Relay chain, which is responsible for forwarding cross-chain transac- 
tions, as shown in Fig. 3. Parachain and Relay chain are consortium blockchains that 
are ideal for practical applications. Three trusted roles are established in the network 
of the model: Validators, Collators, and Supervisors, to help the whole system work 
more effectively. 

Blockchain layer: This layer consists of the core technical components of blockchain, such as peer-to-peer networks, smart contracts, timestamps, and consensus mechanisms. Distributed ledger technology resolves the problem of storing, transferring, and querying transaction information in cross-border payments. The consensus mechanism brings validators to agreement on transactions and ledgers (Zhu, 2021). Digital signatures and timestamps address the double-spending problem, and smart contract technology can automate accounting reconciliation and error handling in cross-border payments, ensuring that transactions are trusted and reliable. The smart contract automatically identifies the actual conditions, matches the situations that arise to the relevant processing rules, and executes them. This way, all information is recorded between the parties synchronously and cannot be tampered with by either party. It can also effectively prevent the loss of information due to technical failures (Huang & Luo, 2021).

Fig. 4 CBDC cross-border payment model architecture diagram (layers shown: Application layer; Architecture layer; Blockchain layer, including smart contract, peer-to-peer networks, propagation mechanism, validation mechanism, hash function, and Merkle tree; Digital currency issuance layer)

Digital currency issuance layer: This layer mainly covers the issuance and redemption of the digital currency used in this model, as well as its management and maintenance. Since the essence of building a cross-border payment system based on CBDC is to establish a regional economic association, regional economic cooperation is a prerequisite for CBDC to be recognized in cross-border payments. Therefore, either a multi-country CBDC anchored to a basket of fiat currencies or a stablecoin anchored to a basket of fiat currencies is considered for issuance as the circulating currency in the model. This digital currency should only be used for cross-border payment clearing between the Parachains of individual countries. It cannot be freely exchanged or used outside this cross-border payment model, that is, outside transfers between financial institutions on the Parachains. The intrinsic value and purchasing power of this digital currency can be determined by each party's central basket of traded goods based on historical transaction volumes (or other forms). In this way, it can bypass US dollar settlement and circumvent the constraints of countries' foreign exchange reserves anchored by the US dollar without challenging the monetary sovereignty of national central banks (Huang & Luo, 2021).
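
One possible way to turn "a basket weighted by historical transaction volumes" into a number is sketched below. The chapter does not fix a formula, so everything here is an assumption: weights are taken proportional to each member's historical volume, the unit value is the weighted sum of member currency values in a common reference unit, and the names `basket_weights` and `basket_value` are hypothetical.

```python
def basket_weights(trade_volumes):
    """Weight of each member currency, proportional to historical volume."""
    total = sum(trade_volumes.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    return {ccy: volume / total for ccy, volume in trade_volumes.items()}


def basket_value(weights, rates):
    """Value of one basket unit, given each currency's value in a reference unit."""
    return sum(weights[ccy] * rates[ccy] for ccy in weights)
```

For instance, volumes of 300 and 100 give weights of 0.75 and 0.25; with member values of 1.0 and 2.0 in the reference unit, one basket unit is worth 1.25.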
6 Summary and Prospect


Building on consortium blockchain technology and using Polkadot's Parachain, Relay chain, and cross-chain technologies as references, this chapter designs a scalable, high-efficiency, high-security, and privacy-protecting CBDC cross-border payment model. Privacy protection is pursued by using hash digest technology and CoinJoin technology to obfuscate the input and output addresses of transactions. The construction of a free-floating fiat digital currency system that bypasses U.S. dollar settlement is discussed, in which multi-country CBDCs or stablecoins anchored to a basket of fiat currencies are issued as the circulating currency in the model, in order to contribute to the study of CBDC cross-border payments.

As CBDC cross-border payment technology improves and cross-chain technology matures, a regionally centered, polycentric pattern is expected to form: a new pattern of economic development in which “different currencies in the same system” are used within a region and “different currencies in different systems” are used between regions. CBDC cross-border payment has become an international research hotspot. Although some countries are already experimenting with CBDC cross-border payment, the security and scalability of the underlying cross-chain technology still need time to be proven. Follow-up research can focus on the following two levels. Technically, the focus and difficulty of CBDC cross-border payments lie in cross-chain technology, so research on more secure and efficient cross-chain technology is a hotspot for future work. At the same time, research on new consensus algorithms applicable to cross-chain blockchain settings will also greatly improve the efficiency of CBDC cross-border payments. In terms of regulation, it is necessary to strengthen the supervision of CBDC cross-border payments: it is essential not only to identify the regulatory authority for CBDC cross-border payment but also to advance the legal study of it. In addition, we can learn from the regulatory sandbox model and introduce a “Chinese regulatory sandbox” to balance risk and innovation.
LLE Based K-Nearest Neighbor
Smoothing for scRNA-Seq Data
Imputation


Yifan Feng, Yutong Ai, and Hao Jiang 


Abstract The single-cell RNA sequencing (scRNA-seq) technique allows gene expression to be measured at the single-cell level, but scRNA-seq data often contain missing values, a large proportion of which are caused by technical failures to detect gene expression; such an occurrence is called a dropout event. The dropout issue poses a great challenge for scRNA-seq data analysis. In this chapter, we introduce a method based on KNN-smoothing, LLE-KNN-smoothing, to impute the dropout values in scRNA-seq data and show that LLE-KNN-smoothing greatly improves the recovery of gene expression in cells and performs better than state-of-the-art imputation methods on a number of scRNA-seq data sets.


Keywords LLE · scRNA-seq · Dropout issue


1 Introduction 


Single-cell RNA sequencing (scRNA-seq) was first reported in Tang et al. (2009). It is a high-throughput technology for sequencing the transcriptome at the single-cell level, reflecting the heterogeneity between cells. The technology plays a significant part in many fields, such as developmental biology and microbiology, and has gained a lot of attention in life science research (Kelsey et al., 2017; Stubbington et al., 2019).

The advent of scRNA-seq technology provides great help for revealing hidden biological functions. However, scRNA-seq data are noisy and incomplete, containing a large number of zero values. The zero values caused by failure of signal detection are called dropouts (Liu & Trapnell, 2016). A dropout event results from a failure to amplify the original RNA transcript, and the resulting noise may obscure potential biological signals and hinder downstream analysis. Hence, distinguishing true biological zeros from false zeros in scRNA-seq data is a great challenge.

A great number of imputation methods have been proposed to handle the missing-value problems arising in bulk RNA-seq data (Moorthy et al., 2019). For example, Kim et al. proposed a local least squares imputation method called LLSimpute (Kim et al., 2004). This method uses least squares optimization to represent a gene with missing values as a linear combination of similar genes. Aittokallio (2010) proposed a method based on fuzzy clustering and gene ontology to estimate the missing values in microarray data. However, these imputation methods may not be directly applicable to scRNA-seq data, as scRNA-seq data are much sparser than bulk RNA-seq data.

In the design of imputation methods for scRNA-seq data, some researchers try to interpret the observed data through a probability distribution model. Typical models assume that scRNA-seq data follow a Poisson or negative binomial distribution. The analysis of Ziegenhain (2017) and various other studies show that, in the absence of real expression differences, the mean-variance relationship of genes or proteins closely follows a Poisson distribution (Grün et al., 2014). The randomness of single-cell sequencing technology leads to excess zero values in single-cell data, and many studies include zero inflation to explain the excess zeros in scRNA-seq data (Fan et al., 2016; Parekh et al., 2017; Pierson and Yau, 2015; Risso et al., 2017).
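
For reference, the standard zero-inflated Poisson model shows how zero inflation adds excess zeros to a count distribution. The symbols below are introduced only for this illustration and are not taken from the cited studies: $Y$ is an observed count, $\lambda$ is the Poisson mean, and $\pi$ is the probability of a structural (dropout-like) zero.

$$P(Y = 0) = \pi + (1 - \pi)\,e^{-\lambda}, \qquad P(Y = k) = (1 - \pi)\,\frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 1, 2, \ldots$$

With $\pi = 0$ the model reduces to the ordinary Poisson distribution; any $\pi > 0$ raises the probability of a zero above $e^{-\lambda}$, which is the effect such models use to account for dropouts.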

MAGIC (Dijk et al., 2017) is a graph imputation method based on a Markov affinity matrix. For a given cell, MAGIC first finds its most similar cells and aggregates the gene expression of these highly similar cells, so as to estimate the gene expression affected by dropout events and other noise sources. However, due to the sparsity of scRNA-seq data, the nearest neighbors in the original data may not be the most biologically similar cells, which may add new bias to the data and eliminate meaningful biological properties. KNN-smoothing (Wagner et al., 2017) performs imputation by identifying the k nearest neighbors of each cell and updating its expression with their average expression. DrImpute (Gong et al., 2018) is also a smoothing method, designed on the basis of the consensus clustering method for scRNA-seq data (Kiselev et al., 2017). In this method, Spearman and Pearson correlation coefficients are used to calculate the distance matrix between cells, and K-means is used to cluster the distance matrix over the expected range of cluster numbers. These methods are representative of the class of smoothing-based imputation methods.

Model-based imputation methods constitute a large proportion of imputation methods 
for scRNA-seq data. scImpute (Wei et al., 2018) uses a mixture model to distinguish 
dropout zeros from true zeros. However, scImpute assumes a dropout rate for each 
gene, whereas it has been confirmed that the dropout rate of a gene depends on 
many factors, such as cell type and RNA-seq protocol (Kharchenko et al., 2014), 
so the selection of the dropout rate may need further discussion and research. SAVER 
(Mo et al., 2017) assumes that the original data follow a Poisson distribution, forms 
a prediction model for each gene from the observed gene counts (UMI), and then 
uses the weighted average of the observed count and the predicted value to restore 
the true expression of each gene in each cell. netNMF-sc (Elyanow et al., 2020) 
combines network-regularized nonnegative matrix factorization with the modeling of 
zero inflation in the transcript count matrix. VIPER (Chen and Zhou, 2018) is based on 
a nonnegative sparse regression model, which predicts the cells to be imputed by 
actively selecting a sparse set of local neighborhood cells. In addition, VIPER models 
the dropout probability in a cell-type-specific and gene-specific manner and infers 
all modeling parameters from the data using an efficient quadratic programming 
algorithm. 

Deep learning based imputation methods have gained a lot of attention in recent 
years. AutoImpute (Talwar et al., 2018) is based on a deep autoencoder applied to the 
sparse gene expression matrix. DCA (deep count autoencoder) (Eraslan et al., 2019) is 
based on a negative binomial noise model; it learns gene-specific distribution 
parameters by minimizing the reconstruction error in an unsupervised manner and 
can be applied to data sets of millions of cells. DeepImpute (deep neural network 
imputation) (Arisdakessian et al., 2018) imputes genes by constructing multiple 
sub-neural networks. The method uses dropout layers and loss functions to learn 
the distribution in the data and constructs a predictive model that imputes only the 
missing data. 

Ensemble methods were proposed mainly to integrate the advantages of the 
available methods. EnImpute (Zhang, 2019) combines the results of eight 
different imputation methods (ALRA, DCA, DrImpute, MAGIC, SAVER, scImpute, 
scRMD, Seurat) and takes the trimmed mean to obtain robust results. SHARP (Wan et al., 
2020) is an algorithm based on ensemble random projection (RP) that is capable of 
handling data sets on the scale of 10 million cells. 

Among the above methods, the smoothing-based methods mainly impute the missing 
values according to the expression of similar cells, and therefore rely heavily on the 
distance measure used to define similarity. The model-based methods can better 
distinguish the real zeros from the dropouts, but the results largely depend on the 
assumptions of the models, which may limit their generalization ability. Deep learning 
has high scalability and can process larger data sets, but at the same time it requires 
considerable time for training, and its memory consumption is larger than that of 
other methods. In this chapter, we propose a KNN-smoothing method based on LLE 
(locally linear embedding) (Zhou, 2016) for single-cell data imputation. When dealing 
with real data, the global non-linearity of LLE and its property of maintaining the 
manifold structure can better restore the data. Compared with other methods, we 
believe that LLE-smoothing achieves better results. 
Algorithm 1 K-nearest neighbor smoothing for UMI-filtered scRNA-Seq data 

Input: 
    p, the number of genes. 
    n, the number of cells. 
    X, a p x n matrix. 
    k, the number of neighbors to use for smoothing. 
    d, the number of principal components to use for determining neighbors. 
Output: 
    S, a p x n smoothed matrix. 

procedure KNN-SMOOTH(p, n, X, k, d) 
    S = COPY(X) 
    steps = ceil(log2(k + 1)) 
    for t = 1 to steps do 
        M = MEDIAN-NORMALIZE(S) 
        F = FREEMAN-TUKEY-TRANSFORM(M) 
        Y = LEADING-PC-SCORES(F, d) 
        D = PAIRWISE-DISTANCE(Y) 
        A = ARGSORT-ROWS(D) 
        k_step = MIN(2^t - 1, k) 
        for j = 1 to n do 
            for i = 1 to p do 
                S_ij = 0 
            end for 
        end for 
        for j = 1 to n do 
            for v = 1 to k_step + 1 do 
                u = A_jv 
                for i = 1 to p do 
                    S_ij = S_ij + X_iu 
                end for 
            end for 
        end for 
    end for 
    return S 


2 Materials and Methods 


2.1 The K-Nearest Neighbor Smoothing Algorithm 


The k-nearest neighbor smoothing (KNN-smoothing) algorithm realizes imputation 
by aggregating information from similar cells based on the k-nearest neighbor (KNN) 
idea. The algorithm is formalized in Algorithm 1. Here, $X_{ij}$ refers to the 
expression of the $i$-th gene in the $j$-th cell of $X$. COPY(X) returns an 
independent memory copy of X. MEDIAN-NORMALIZE(X) returns a new matrix of the same 
dimension as X, in which the values in each column have been scaled by a constant so 
that the column sum equals the median column sum of X. FREEMAN-TUKEY-TRANSFORM(X) 
returns a new matrix of the same shape as X, in which all values have been 
Freeman-Tukey transformed (FTT; Freeman and Tukey, 1950), $f(x) = \sqrt{x} + \sqrt{x + 1}$. 
LEADING-PC-SCORES(X, d) returns the principal component scores of the observations 
in X (contained in the columns) for the first d principal components. 
PAIRWISE-DISTANCE(X) computes the pairwise distance matrix D from X, where 
$D_{ij}$ is the Euclidean distance between the $i$-th column and the $j$-th column of X. 
For a matrix D with n columns, ARGSORT-ROWS(D) returns a matrix of indices 
A that sorts D in a row-wise manner, i.e., $D_{jA_{j1}} \le D_{jA_{j2}} \le \cdots \le D_{jA_{jn}}$ for all $j$. 
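
The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the reference implementation of Wagner et al.: the function names (`median_normalize`, `freeman_tukey`, `leading_pc_scores`, `pairwise_distance`, `knn_smooth`) are ours, the principal component scores are obtained from an SVD of the gene-centered matrix, and every cell is assumed to have a non-zero total count.

```python
import numpy as np


def median_normalize(X):
    """Scale each column (cell) so that its sum equals the median column sum."""
    col_sums = X.sum(axis=0)  # assumed to be non-zero for every cell
    return X * (np.median(col_sums) / col_sums)


def freeman_tukey(X):
    """Freeman-Tukey transform f(x) = sqrt(x) + sqrt(x + 1)."""
    return np.sqrt(X) + np.sqrt(X + 1.0)


def leading_pc_scores(F, d):
    """Scores of the cells (columns of F) on the first d principal components."""
    centered = F - F.mean(axis=1, keepdims=True)  # center each gene
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    return s[:d, None] * Vt[:d]  # d x n


def pairwise_distance(Y):
    """Euclidean distances between the columns of Y (n x n matrix)."""
    sq = (Y ** 2).sum(axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Y.T @ Y)
    return np.sqrt(np.maximum(d2, 0.0))


def knn_smooth(X, k, d):
    """Stepwise k-nearest-neighbor smoothing of a genes x cells count matrix."""
    X = np.asarray(X, dtype=float)
    S = X.copy()
    steps = int(np.ceil(np.log2(k + 1)))
    for t in range(1, steps + 1):
        F = freeman_tukey(median_normalize(S))
        Y = leading_pc_scores(F, d)
        A = np.argsort(pairwise_distance(Y), axis=1, kind="stable")
        k_step = min(2 ** t - 1, k)
        S = np.zeros_like(X)
        for j in range(X.shape[1]):
            # aggregate the raw counts of cell j and its k_step nearest neighbors
            S[:, j] = X[:, A[j, : k_step + 1]].sum(axis=1)
    return S
```

Note that the aggregation always sums the original counts X, while the neighbors are re-determined at every step from the current smoothed matrix S. Replacing `leading_pc_scores` by another embedding changes only the neighbor search.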


2.2 Locally Linear Embedding 


The k-nearest neighbor smoothing algorithm depends heavily on the distance 
evaluation, and hence LEADING-PC-SCORES(X, d) is a critical step in the realization 
of the algorithm. Considering that PCA is a linear embedding method 
that may neglect the non-linear intrinsic properties of scRNA-seq data, we propose 
an LLE-based method for the low-dimensional projection of scRNA-seq data. 

LLE is a dimensionality reduction method based on the concept of a topological 
manifold. It assumes that each sample point and its neighboring sample points in the 
high-dimensional space are approximately located on a hyperplane, so the sample point 
can be reconstructed by a linear combination of its neighboring sample points. Since 
the LLE algorithm only considers the k-nearest-neighbor information of each point, 
it is computationally efficient. Assume $X = (x_1, x_2, \ldots, x_N) \in \mathbb{R}^{D \times N}$. 
Each data point $x_i \in \mathbb{R}^{D \times 1}$ can be represented by a linear 
combination of its $k$ nearest neighbors: 

$$x_i = \sum_{j=1}^{k} w_{ji} x_{ji} \tag{1}$$

$$w_i = \begin{bmatrix} w_{1i} \\ w_{2i} \\ \vdots \\ w_{ki} \end{bmatrix}, \qquad x_i = \begin{bmatrix} x_{1i} \\ x_{2i} \\ \vdots \\ x_{Di} \end{bmatrix} \tag{2}$$

The weights are obtained by minimizing the following loss function: 

$$\arg\min_{W} \sum_{i=1}^{N} \left\| x_i - \sum_{j=1}^{k} w_{ji} x_{ji} \right\|^2 \tag{3}$$


Solving the above problem yields the weight coefficients 

$$W = [w_1, w_2, \ldots, w_N] \tag{4}$$

where $w_i \in \mathbb{R}^{k \times 1}$ $(i = 1, 2, \ldots, N)$ corresponds to the $i$-th of the $N$ data points. 

After reducing the original data from $D$ dimensions to $d$ dimensions, $x_i \to y_i$, the 
reduced representation can still be expressed as a linear combination of its k nearest 
neighbors, and the combination coefficients remain unchanged, so the loss function 
can be written as: 

$$\Psi(Y) = \sum_{i=1}^{N} \left\| y_i - \sum_{j=1}^{k} w_{ji} y_{ji} \right\|^2 \tag{5}$$

where $Y$ is the data located in the low-dimensional space after dimensionality 
reduction: 

$$Y = [y_1, y_2, \ldots, y_N] \tag{6}$$
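Using the constraint $\sum_{j=1}^{k} w_{ji} = 1$, and writing $X_i = [x_i, \ldots, x_i]$ for $k$ copies 
of $x_i$ and $N_i = [x_{1i}, \ldots, x_{ki}]$ for its $k$ neighbors, we can rewrite the optimization 
objective as follows: 

$$\begin{aligned}
\Phi(W) &= \sum_{i=1}^{N} \left\| x_i - \sum_{j=1}^{k} w_{ji} x_{ji} \right\|^2 \\
        &= \sum_{i=1}^{N} \left\| \sum_{j=1}^{k} (x_i - x_{ji}) w_{ji} \right\|^2 \\
        &= \sum_{i=1}^{N} w_i^{T} (X_i - N_i)^{T} (X_i - N_i) w_i
\end{aligned} \tag{7}$$

Regarding $S_i$ as the local covariance matrix, we have 

$$S_i = (X_i - N_i)^{T} (X_i - N_i), \qquad \Phi(W) = \sum_{i=1}^{N} w_i^{T} S_i w_i \tag{8}$$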
We can introduce a Lagrange multiplier for the constraint $w_i^{T} 1_k = 1$, 

$$L(w_i) = \sum_{i=1}^{N} w_i^{T} S_i w_i + \lambda \left( w_i^{T} 1_k - 1 \right) \tag{9}$$

and obtain the optimal solution by differentiation: 

$$\frac{\partial L(w_i)}{\partial w_i} = 2 S_i w_i + \lambda 1_k = 0 \tag{10}$$

$$w_i = \frac{S_i^{-1} 1_k}{1_k^{T} S_i^{-1} 1_k} \tag{11}$$

where $1_k$ is the $k \times 1$ column vector of ones and the local covariance matrix 
$S_i$ is a $k \times k$ matrix. The denominator is the sum of all elements of the 
inverse matrix $S_i^{-1}$, and the numerator is the column vector obtained by 
summing the rows of $S_i^{-1}$. 

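Finally, the optimization problem for the low-dimensional embedding becomes 

$$\arg\min_{Y} \Psi(Y) = \sum_{i=1}^{N} \left\| y_i - \sum_{j=1}^{k} w_{ji} y_{ji} \right\|^2 \tag{12}$$

$$\text{s.t.} \quad \sum_{i=1}^{N} y_i = 0, \qquad \sum_{i=1}^{N} y_i y_i^{T} = N I_{d \times d} \tag{13}$$

where 

$$Y = [y_1, y_2, \ldots, y_N] \in \mathbb{R}^{d \times N} \tag{14}$$

Let $W$ now denote the $N \times N$ matrix whose $i$-th column holds the weights $w_i$ at the 
positions of the neighbors of $x_i$ and zeros elsewhere, and let $M$ denote 

$$M = (I - W)(I - W)^{T} \tag{15}$$

The optimization problem can be rewritten as: 

$$\arg\min_{Y} \operatorname{tr}(Y M Y^{T}), \qquad \text{s.t.} \quad Y Y^{T} = N I_{d \times d} \tag{16}$$

It can be seen that $Y^{T}$ is a matrix composed of eigenvectors of $M$, 
so we only need to take the eigenvectors corresponding to the $d$ smallest non-zero 
eigenvalues of $M$. 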
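
The derivation in Eqs. (1)-(16) can be sketched in NumPy as follows. This is an illustrative sketch rather than the implementation used in the chapter. The function name `lle` and the parameter `reg` are our assumptions: `reg` adds a small multiple of the identity to each local covariance matrix so that Eq. (11) can be solved when $S_i$ is singular (for example when $k > D$), a step not discussed in the text. The first eigenvector of $M$, associated with the zero eigenvalue, is assumed to be the constant vector and is discarded.

```python
import numpy as np


def lle(X, k, d, reg=1e-3):
    """Locally linear embedding of the columns of X (D x N) into d dimensions.

    Returns the embedding Y (d x N) and the N x N weight matrix W whose
    i-th column holds the reconstruction weights of point i.
    """
    X = np.asarray(X, dtype=float)
    N = X.shape[1]

    # k nearest neighbors of every point (excluding the point itself)
    sq = (X ** 2).sum(axis=0)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(dist2, np.inf)
    neighbors = np.argsort(dist2, axis=1)[:, :k]

    W = np.zeros((N, N))
    ones = np.ones(k)
    for i in range(N):
        Z = X[:, [i]] - X[:, neighbors[i]]        # X_i - N_i, shape D x k
        S = Z.T @ Z                               # local covariance, Eq. (8)
        S += (reg * np.trace(S) / k + 1e-12) * np.eye(k)  # assumed regularization
        w = np.linalg.solve(S, ones)              # proportional to S^{-1} 1_k
        W[neighbors[i], i] = w / w.sum()          # Eq. (11)

    I_minus_W = np.eye(N) - W
    M = I_minus_W @ I_minus_W.T                   # Eq. (15)
    eigvals, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    Y = eigvecs[:, 1 : d + 1].T * np.sqrt(N)      # skip the constant eigenvector
    return Y, W
```

Scaling the orthonormal eigenvectors by $\sqrt{N}$ enforces the constraint $Y Y^{T} = N I_{d \times d}$ of Eq. (16). In an LLE-based smoothing step, the returned `Y` would take the place of the principal component scores before the pairwise distances are computed.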


3 Results 


3.1 Availability of Data 


The scRNA-seq data sets are available from the Gene Expression Omnibus (GEO) 
database. Here, we use three data sets, Brain (Darmanis et al., 2015), Zeisel, and 
Klein, for method evaluation. Zeisel and Klein can be downloaded from the GEO database 
with accession numbers GSE60361 and GSE65525, respectively (Table 1). 
Algorithm 2 Locally linear embedding neighbor smoothing for UMI-filtered scRNA-Seq 
data 

Input: 
    p, the number of genes. 
    n, the number of cells. 
    X, a p x n matrix. 
    k, the number of neighbors to use for smoothing. 
    d, the dimension of manifold learning. 
Output: 
    S, a p x n smoothed matrix. 

procedure LLE-SMOOTHING(p, n, X, k, d) 
    S = COPY(X) 
    steps = ceil(log2(k + 1)) 
    for t = 1 to steps do 
        M = MEDIAN-NORMALIZE(S)             // a new p x n matrix 
        F = FREEMAN-TUKEY-TRANSFORM(M)      // a new p x n matrix 
        Y = LLE(F, d)                       // a new d x n matrix 
        D = PAIRWISE-DISTANCE(Y)            // a new n x n matrix 
        A = ARGSORT-ROWS(D)                 // a new n x n matrix 
        k_step = MIN(2^t - 1, k) 
        for j = 1 to n do 
            for i = 1 to p do 
                S_ij = 0 
            end for 
        end for 
        for j = 1 to n do 
            for v = 1 to k_step + 1 do 
                u = A_jv 
                for i = 1 to p do 
                    S_ij = S_ij + X_iu 
                end for 
            end for 
        end for 
    end for 
    return S 


3.2 Data Processing and Visualization 


The input of our method is a count matrix X with rows representing genes and 
columns representing cells. After median normalization and the FTT transformation 
according to the process of Algorithm 2, X is mapped to a d-dimensional space by 
LLE. The Euclidean distances between the samples are calculated to form the 
n x n distance matrix, from which the k nearest neighbors of each sample are found, 
and the data are then smoothed step by step from 1 to k neighbors. We use 
t-distributed stochastic neighbor embedding (t-SNE) to visualize the data. 
Table 1 Summary of data sets used for imputation 

Data set    Data size        Cell clusters 
Klein       24175 × 2716     4 
Brain       16384 × 420      8 
Zeisel      4412 × 3005      9 


3.3 Performance Evaluation 


For evaluation, we use SC3 to cluster the imputed data to test the imputation effect. 
The adjusted Rand index (ARI) is used to evaluate the agreement between 
the original cluster labels of the data set and the cluster labels given by SC3. At 
k = 32 (Table 2), LLE-KNN-smoothing provides the best ARI on the Klein data set 
(0.9868) and ARI values close to the best on Brain (0.9425, against 0.9505 for 
KNN-KPCA) and Zeisel (0.7917, against 0.8352 for KNN-KPCA), and it improves on the 
original KNN-smoothing and on MAGIC on all three data sets. Regarding the number of 
nearest neighbors k, the ARI of LLE-KNN-smoothing is relatively high on the different 
data sets, and when the value of k changes, the clustering accuracy of 
LLE-KNN-smoothing changes slowly and remains stable (Figs. 1, 2 and 3). 
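
For readers who want to reproduce the evaluation metric, the adjusted Rand index can be computed from the contingency table of the two labelings. The sketch below is a plain Python version of the standard ARI formula; the function name `adjusted_rand_index` is ours, and the label vectors in the usage lines are toy values, not data from the experiments.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0

    # pairs of samples placed together in both, in the first, in the second
    together = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = in_true * in_pred / total_pairs
    maximum = (in_true + in_pred) / 2
    if maximum == expected:  # both labelings are trivial
        return 1.0
    return (together - expected) / (maximum - expected)


# toy usage: the ARI ignores the names of the clusters
same_partition = adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])
partial_match = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

An ARI of 1 means the two partitions are identical up to a renaming of the clusters, while values near 0 correspond to chance-level agreement.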

We use t-SNE visualization to analyze the advantages and disadvantages of the various 
methods on the different data sets. We find that LLE-KNN-smoothing compares favorably 
with the other methods (Table 2) and that it gives clearer inter-class separation and 
intra-class compactness (Fig. 4 shows the result for the Brain data set). 


Fig. 1 Different k of the four imputation methods on data set Brain 
Fig. 2 Different k of the four imputation methods on data set Zeisel 
Fig. 3 Different k of the four imputation methods on data set Klein 


Table 2 ARI of different imputation methods using SC3 clustering results (k = 32) 

ARI                  Brain     Zeisel    Klein 
KNN-smoothing        0.8007    0.3187    0.8508 
KNN-KPCA             0.9505    0.8352    0.8649 
KNN-UMAP             0.8781    0.8272    0.8594 
LLE-KNN-smoothing    0.9425    0.7917    0.9868 
MAGIC                0.9239    0.2723    0.3604 


4 Conclusions 


In this chapter, we have used different data sets to demonstrate that 
LLE-KNN-smoothing compares favorably with other methods. In future work, we will 
continue to study the selection of the parameters k and d, as well as other manifold 
learning methods. Further work will be devoted to exploring the effect of smoothing on 
differential expression analysis, gene set enrichment analysis, trajectory inference, 
etc. We anticipate that the LLE-KNN-smoothing algorithm will perform well. 
Fig. 4 t-SNE visualization of the reduced dimensions of the five imputation methods on data set 
Brain. a Raw data. b-f Data after KNN-smoothing, KNN-smoothing (KPCA), KNN-smoothing 
(UMAP), LLE-KNN-smoothing, and MAGIC 
The Application of Time Series Analysis 
in the Fiscal Budget Variance of China 


Guanhua Chen and Xingi Gong 


Abstract During budget planning and execution, irregular behaviors are reflected 
in the difference between budgeted and actual figures (named budget variance). 
Considering that both processes are led by the Government of China (hereinafter 
called GOC), the budget variance is widely used to evaluate the fiscal system. This 
chapter collects State General Public Budget data from 2000 to 2018 and analyzes 
their influence on budget variance. The forecast of budget variance is then completed 
by modeling the budget execution and the budget variance rate separately. Descriptive 
analysis and the AIC (Akaike Information Criterion) are used to choose the candidate 
models, and the RMSE (Root Mean Square Error) on test data is used to select the 
final model. The forecast shows that the extent of budget variance will be further 
controlled in 2011 and 2012. This chapter explains the result with fiscal theories to 
enhance its credibility and thereby provides policy advice on Chinese budget reform. 


Keywords Budget variance * Time series analysis * Fiscal science 


1 Introduction 


Budgeting is an important administrative process, which reveals both the range and 
direction of government action and the effectiveness of the monitoring of 
government activities by the National People's Congress (hereinafter called NPC) 
and the private sector (Chen, 2000). Since the tax sharing reform in 1994, China has 
accomplished extensive and profound reforms in the budgeting process. However, there 
is still room for improvement. According to the "Decision of the State Council 
on Deepening the Reform of the Budget Management System" issued in 2014, the 
remaining problems are summarized as "the budgeting is not scientific enough, the 
budget system needs more supervision, the scale of financial carry-over and balance 
funds is large, and the budget data needs more transparency", etc. Irregular behaviors 
in budget planning and execution will be reflected in the budget variance. One of the 
characteristics of a scientific, transparent, and standardized budget system is that 
the final account income and expenditure are consistent with those planned in the 
budget. 

Since the beginning of the 21st century, the level of budget variance of the Chinese 
government has experienced a rise and then a fall. Budget variance generally expanded 
year by year (Sun & Wu, 2012; Wang, 2009) from 2000 to 2011, reaching 15.8% of 
over-collection and 9% of over-spending in 2011. From 2012 to 2018, it was 
significantly controlled. Over the whole period from 2000 to 2018, the average budget 
variance in revenue was 6.39%, and that of spending was 4.29%. Figures for the United 
Kingdom and the United States are instructive for further reform: the average public 
sector recurrent revenue budget deviation in the United Kingdom from 2001 to 2003 was 
-2.8% (Wang, 2009). In the United States, the average expenditure deviation from the 
fiscal budget was 2.1% in 2007 (Cui n.d.). China still has a long way to go in terms 
of reducing budget variance in comparison with the countries mentioned above. 

The budget is naturally uncertain, so a certain degree of variance is permitted. 
However, large variance can lead to economic and institutional problems. For instance, 
excessive revenue adds to the burden on taxpayers and reduces market vitality, 
while excessive under-spending leads to inadequate provision of public goods 
(e.g., infrastructure) by the government, thus affecting social stability and people's 
livelihood. Institutionally, excessive budget variance implies weak monitoring of 
the government. Therefore, it is important for the Chinese government to speed up the 
budget reform and to establish a modern fiscal system to promote the modernization 
of national governance capacity. 

Budget variance has great research value: historical data can be analyzed to 
reveal the achievements of the reform of China's fiscal system, and forecasting 
this indicator can show whether there is still room for improvement and provide a 
reference for the direction of further reform. The main work of this chapter has three 
parts. First, we analyze the Chinese budget variance, clarifying the linkages between 
fiscal deviations and the general environment at home and abroad through descriptive 
statistics. Then the chapter builds a series of time series models and selects the 
best one to predict the variance in the next two years. Finally, the chapter introduces 
fiscal science theory to explain the prediction and gives some policy advice. 
2 A General View on Budget Data 


2.1 Introduction to Concept and Data Source 


The State General Public Budget is one of the four major components of the Chinese 
fiscal system and can be divided into revenue and expenditure. The revenue part mainly 
includes tax income and non-tax income, such as confiscation income and funds 
transferred from the central government to local governments. The expenditure part is 
aimed at improving the people's livelihood and maintaining national security, and 
includes spending on defense, education, infrastructure, etc. The budgeting process 
can be divided into two stages, budget planning and budget execution. The former 
is the annual fiscal revenue and expenditure plan of the state, which is examined and 
approved through legal procedures; it stipulates the sources of national revenue and 
the purposes of spending, reflecting the scope and direction of government activities. 
The latter is the annual implementation of the budget plan, reflecting the actual 
economic activities at the state level. 
The budget variance rate is calculated as: 

$$\text{Budget Variance Rate} = \frac{\text{Budget Execution} - \text{Budget Planning}}{\text{Budget Planning}} \tag{1}$$


The chapter uses the data of State General Public Budget Revenue and Expenditure 
ranging from 2000 to 2018 (see Table 1), which is obtained from the China Financial 
Yearbook. 
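
As a quick check of Eq. (1), the following sketch recomputes the revenue variance rate for three years using the budget and execution figures listed in Table 1. The function name `budget_variance_rate` is ours and the result is expressed in percent.

```python
def budget_variance_rate(planned, executed):
    """Budget variance rate in percent: (execution - planning) / planning."""
    if planned == 0:
        raise ValueError("planned budget must be non-zero")
    return (executed - planned) / planned * 100.0


# revenue figures (budget, execution) taken from Table 1
revenue = {
    2000: (12337.77, 13395.23),
    2011: (89720.0, 103874.43),
    2015: (154300.0, 152269.23),
}

rates = {year: round(budget_variance_rate(p, e), 1) for year, (p, e) in revenue.items()}
for year, rate in rates.items():
    print(year, rate)
```

A positive rate corresponds to over-collection (or over-expenditure) and a negative rate to under-collection, as in 2015.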


2.2 Descriptive Analysis of Budget Variance 


As seen in Fig. 1, budget deviation was a common phenomenon from 2000 to 
2018 on a yearly basis. In most years there was over-collection and over-expenditure; 
under-collection occurred only in 2015 and under-expenditure only in 2014. 
Over-collection rates peak in 2007 (16.5%), 2011 (15.8%) and 2010 (12.4%), while 
over-expenditure rates peak in 2011 (9%), 2001 (8.9%) and 2007 (7%). The average 
revenue budget variance rate from 2000 to 2018 is 6.4%, which is higher than that 
of expenditure, which is 4.3%. The revenue and expenditure variance rates follow a 
consistent trend, and their correlation coefficient is 0.77, showing a 
strong positive correlation. 
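
The averages and the correlation coefficient quoted above can be recomputed from the variance rate columns of Table 1. The sketch below uses the rounded rates as printed in the table, so the correlation is only expected to land close to the reported 0.77; the helper names `mean` and `pearson` are ours.

```python
from math import sqrt

# variance rates (%) for 2000-2018, copied from Table 1
revenue_rate = [8.6, 11.0, 4.9, 5.9, 12.0, 8.2, 9.4, 16.5, 4.9, 3.5,
                12.4, 15.8, 3.2, 2.0, 0.6, -1.3, 1.5, 2.3, 0.1]
expenditure_rate = [4.96, 8.9, 4.45, 4.0, 6.4, 5.2, 5.3, 7.0, 2.0, 0.1,
                    6.3, 9.0, 1.3, 1.4, -0.8, 2.6, 3.9, 4.2, 5.3]


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


avg_revenue = mean(revenue_rate)
avg_expenditure = mean(expenditure_rate)
correlation = pearson(revenue_rate, expenditure_rate)
print(round(avg_revenue, 2), round(avg_expenditure, 2), round(correlation, 2))
```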

The trend of budget variance is influenced by domestic fiscal policies as well as the 
global economic environment. From 1998 to 2004, due to China's two successive 
active fiscal policies, the revenue budget variance rate stayed around 7%, and finally 
peaked in 2007. In 2008 and 2009, due to the negative impact of the U.S. financial 
crisis, both budget variance rates fell back. The new round of fiscal expansion 
in 2010 and 2011 helped China and the world get out of the crisis; however, it also led 
Table 1 State general public budget execution from 2000 to 2018 (billion yuan; %) 

          Revenue                                   Expenditure 
Year      Budget      Execution    Variance         Budget      Execution    Variance 
                                   rate (%)                                  rate (%) 
2000      12337.77    13395.23      8.6             15136.23    15886.5       4.96 
2001      14760.2     16386.04     11.0             17358.3     18902.58      8.9 
2002      18014.83    18903.64      4.9             21112.98    22053.15      4.45 
2003      20501.32    21715.25      5.9             23699.62    24649.95      4.0 
2004      23570.34    26396.47     12.0             26768.64    28486.89      6.4 
2005      29255.03    31649.29      8.2             32255.03    33930.28      5.2 
2006      35423.38    38760.2       9.4             38373.38    40422.73      5.3 
2007      44064.85    51321.78     16.5             46514.85    49781.35      7.0 
2008      58486       61330.35      4.9             61386       62592.66      2.0 
2009      66230       68518.3       3.5             76235       76299.93      0.1 
2010      73930       83101.51     12.4             84530       89874.16      6.3 
2011      89720       103874.43    15.8             100220      109247.79     9.0 
2012      113600      117253.52     3.2             124300      125952.97     1.3 
2013      126630      129209.64     2.0             138246      140212.1      1.4 
2014      139530      140370.03     0.6             153037      151785.56    -0.8 
2015      154300      152269.23    -1.3             171500      175877.77     2.6 
2016      157200      159604.97     1.5             180715      187755.21     3.9 
2017      168630      172592.77     2.3             194863      203085.49     4.2 
2018      183177      183359.84     0.1             209830      220904.13     5.3 
Average   80492.67    83684.87      6.39            90320.05    93563.22      4.29 

Over-collection and over-expenditure are common phenomena from 2000 to 2018. Over-collection 
rates peak in 2007 (16.5%), 2011 (15.8%) and 2010 (12.4%), while over-expenditure rates peak 
in 2011 (9%), 2001 (8.9%) and 2007 (7%) 


to a high budget variance (Chen & Lv, 2019). In the following years, the economy 
of China entered a new normal, and the budget deviation began to decline due to the 
slowdown of GDP growth: under-spending appeared in 2014 and under-collection 
in 2015. The Ministry of Finance has been promoting structural tax cuts and fee 
reductions since 2014 and appropriately expanding the fiscal deficit. Corresponding 
to this policy, the spending budget variance rate exceeded the revenue budget variance 
rate for the first time in 2015 and remained higher than the latter in the following 
years, growing slowly. 

Budget variance was controlled in general after 2015 mainly because the new 
budget law came into effect in that year, deepening the fiscal reform. The reduction 
of the budget variance is the latest achievement in establishing a modern fiscal system 
in China, indicating that the budgeting process is progressing in a more scientific 
direction. The fact that the average expenditure budget variance is smaller than that 
of revenue reflects that the budget review system is more stringent in expenditure 
management than in revenue management. The lack of systematic auditing and monitoring 
of over-collected funds leads to their less frugal use; therefore, over-collection can 
partially explain over-expenditure (Focus on budget deviations, 2008). 

Fig. 1 State General Public Budget variance rate from 2000 to 2018 (%). The revenue and 
expenditure variance rates peaked in 2007 and 2011 and were significantly controlled after 2014 


2.3 Descriptive Analysis of Budget Execution 


As seen in Fig. 2, the budget execution is cyclical on a quarterly basis. In terms of 
revenue, it peaks in the second quarter and then falls back in the rest of the year while 
in terms of expenditure the largest percentage of annual spending occurs in the last 
quarter. The average revenue for the four quarters from 2000 to 2018 is 20789.60, 
23961.65, 19442.53, 19476.90 billion yuan, while that of expenditure is 17888.10, 
23212.16, 21828.40, 30602.28 billion yuan. The cause of spending surging at the end 
of the year is the lack of scientific budgeting and monitoring of the over-collected 
funds. The difference between revenue and expenditure is usually positive in the first 
part of the year and negative in the second part of the year. 


3 Overview of Time Series Analysis Techniques 


Time series are defined as ordered sequences of random variables. The most common 
time series are discrete stochastic processes observed at successive, equally spaced 
time points. Time series analysis describes the intrinsic structure of the series, and 
the target of modeling is to make predictions. In brief, time series forecasting is 
the art of predicting the future by understanding the past. 
Fig. 2 The State General Public Budget execution from 2000 to 2018 (quarterly; billion yuan). 
Revenues and expenditures tend to grow over time, with seasonal fluctuations 
in cycles of four quarters. The difference between revenue and expenditure shows that there is 
often a slight surplus in national income in the first half of the year, which is offset by a 
sudden increase in expenditures in the last quarter 


3.1 Decomposition of Time Series 


A time series can be decomposed into long-term trend variation $T$ (a trend over a 
long period of time), seasonal variation $S$ (regular variation due to seasonal changes), 
cyclical variation $C$ (longer, more irregular cyclical variation) and irregular variation 
$I$ (change caused by many contingent factors). The time series $Y$ can be expressed as 
a function of the above four factors, i.e., $Y = F(T, S, C, I)$, such as the additive model 
($Y = T + S + C + I$) and the multiplicative model ($Y = T \times S \times C \times I$). This chapter 
focuses on modeling $T$, $S$, and $I$ of the time series. 


3.2 ARIMA(p, d, q) 


In AR(p), the value at time $t$ is a linear combination of an intercept, the 
observations of the past $p$ periods, and a normally distributed random error: 

$$X_t = \phi_0 + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t \tag{2}$$

Here $X_t$ represents the value at time $t$, $\varepsilon_t$ represents the normally 
distributed random error at time $t$, with variance $\sigma^2$, which is independent 
of the past values of $X$, and $\phi_i$ represents the weight of lag order $i$. A 
sufficient and necessary condition for AR(p) to be stationary is that all roots of the 
characteristic equation fall outside the unit circle. In particular, the AR(1) model is 
a Markov process, in which $X_t$ depends only on $X_{t-1}$, and the condition for AR(1) 
to be stationary is $|\phi_1| < 1$. Its properties in the stationary case are shown 
below; they show that the auto-correlation of the AR(1) model tails off. 


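$$E(X_t) = \frac{\phi_0}{1 - \phi_1} \tag{3}$$

$$\mathrm{Var}(X_t) = \frac{\sigma^2}{1 - \phi_1^2} \tag{4}$$

$$\gamma_k = \begin{cases} \dfrac{\sigma^2}{1 - \phi_1^2} & k = 0 \\ \phi_1 \gamma_{k-1} & k > 0 \end{cases} \tag{5}$$

$$\rho_k = \begin{cases} 1 & k = 0 \\ \phi_1 \rho_{k-1} & k > 0 \end{cases} \tag{6}$$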

In the MA(q) model, the value at time $t$ is a linear combination of the random errors 
of the past $q$ periods, an intercept, and a normally distributed random error: 

$$X_t = \theta_0 + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_t \tag{7}$$


Here $\theta_i$ represents the weight of lag order $i$. An MA process of finite order 
does not require any precondition to be stationary. The properties of MA(1) 
are shown below, from which it can be seen that the auto-correlation of the finite 
order model cuts off. 

$$E(X_t) = \theta_0 \tag{8}$$

$$\mathrm{Var}(X_t) = \sigma^2 (1 + \theta_1^2) \tag{9}$$

$$\gamma_k = \begin{cases} \sigma^2 (1 + \theta_1^2) & k = 0 \\ \sigma^2 \theta_1 & k = 1 \\ 0 & k > 1 \end{cases} \tag{10}$$
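
The cut-off behaviour of Eq. (10) can be illustrated in the same way. The sketch below evaluates the theoretical auto-covariance of an MA(1) process and compares it with a simulated series; the parameter values and the function names `ma1_autocovariance`, `simulate_ma1`, and `sample_autocovariance` are illustrative assumptions.

```python
import numpy as np


def ma1_autocovariance(theta1, sigma2, k):
    """Theoretical auto-covariance of MA(1) at lag k, following Eq. (10)."""
    k = abs(k)
    if k == 0:
        return sigma2 * (1.0 + theta1 ** 2)
    if k == 1:
        return sigma2 * theta1
    return 0.0


def simulate_ma1(theta0, theta1, sigma, n, seed=0):
    """Simulate X_t = theta0 + theta1 * eps_{t-1} + eps_t."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n + 1)
    return theta0 + theta1 * eps[:-1] + eps[1:]


def sample_autocovariance(x, k):
    """Sample auto-covariance at lag k."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[: len(x) - k], x[k:]) / len(x))


theta0, theta1, sigma = 2.0, 0.5, 1.0   # hypothetical parameters
x = simulate_ma1(theta0, theta1, sigma, n=20000)
empirical = [sample_autocovariance(x, k) for k in range(4)]
theoretical = [ma1_autocovariance(theta1, sigma ** 2, k) for k in range(4)]
```

Beyond lag one the empirical values fluctuate around zero, which is the pattern used later to read an MA order from the auto-correlation plot.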
The ARMA model combines AR(p) with MA(q) and can be written as: 

$$X_t = \phi_0 + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{12}$$

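ARMA(1, 1) also requires $|\phi_1| < 1$ to be stationary, and the properties 
of the stationary ARMA(1, 1) are shown below. Its auto-correlation coefficient 
is similar to that of the AR(1) model, and its partial auto-correlation coefficient is 
similar to that of the MA(1) model; both tail off, decaying from the second lag onwards. 

$$E(X_t) = \frac{\phi_0}{1 - \phi_1} \tag{13}$$

$$\mathrm{Var}(X_t) = \frac{1 + \theta_1^2 + 2 \phi_1 \theta_1}{1 - \phi_1^2} \sigma^2 \tag{14}$$

$$\gamma_k = \begin{cases} \dfrac{1 + \theta_1^2 + 2 \phi_1 \theta_1}{1 - \phi_1^2} \sigma^2 & k = 0 \\ \dfrac{(\phi_1 + \theta_1)(1 + \phi_1 \theta_1)}{1 - \phi_1^2} \sigma^2 & k = 1 \\ \phi_1 \gamma_{k-1} & k > 1 \end{cases} \tag{15}$$

$$\rho_k = \begin{cases} 1 & k = 0 \\ \dfrac{(\phi_1 + \theta_1)(1 + \phi_1 \theta_1)}{1 + \theta_1^2 + 2 \phi_1 \theta_1} & k = 1 \\ \phi_1 \rho_{k-1} & k > 1 \end{cases} \tag{16}$$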
The ARIMA(p, d, q) model introduces the differencing operation to ARMA(p, q). 
In ARIMA(p, d, q), a time series is first transformed into a stationary one through 
$d$-th order differencing, and then the ARMA model is established on it. Using $B$ to 
denote the lag operator, ARIMA(p, d, q) can be defined as: 

$$X'_t = (1 - B)^d X_t \tag{17}$$

$$X'_t = \phi_0 + \sum_{i=1}^{p} \phi_i X'_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{18}$$
3.3 SARIMA(p, d, q)(P, D, Q)_s 


The SARIMA model introduces a seasonal component to ARIMA. Using $B$ to denote the lag 
operator and $B^s$ to denote the $s$-order lag operator, SARIMA(p, d, q)(P, D, Q)_s can be 
defined as: 

$$Y_t = (1 - B^s)^D (1 - B)^d X_t \tag{19}$$

$$Y_t = \phi_0 + \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{i=1}^{P} \Phi_i Y_{t-si} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \sum_{i=1}^{Q} \Theta_i \varepsilon_{t-si} \tag{20}$$

4 Modeling of Budget Variance 


4.1 Prediction of Budget Execution 


4.1.1 Exploratory Data Analysis 


Fiscal revenue and expenditure increased between 2000 and 2018, and the 
volatility was small. In Figs. 3 and 4 the auto-correlation decreases slowly, so a unit 
root non-stationary model can be applied. 

The unit root is detected by the ADF test. After first-order differencing the serial 
auto-correlation still decreases slowly, and the auto-correlation plot shows that there 
is likely to be a seasonal pattern with a period of four for both fiscal revenue and 
expenditure. Revenue and expenditure items are often similar to those of the same 
period of the previous year. For example, some expenditure items in education, defense, 
and public facilities are fixed every year. Therefore, the seasonal effect is 
reasonable, and we further apply a fourth-order (lag-four seasonal) difference to the 
data. 
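
The effect of combining a first-order difference with a lag-four seasonal difference, i.e. applying $(1 - B^4)(1 - B)$ to the series, can be illustrated on a synthetic quarterly series. The series below is a hypothetical example with a quadratic trend and a fixed quarterly pattern, not the budget data; the function name `difference` is ours.

```python
import numpy as np


def difference(x, lag=1):
    """Apply (1 - B^lag) to a series: x_t - x_{t-lag}."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]


# hypothetical quarterly series: quadratic trend plus a fixed seasonal pattern
t = np.arange(40)
seasonal = np.tile([3.0, 8.0, -2.0, 5.0], 10)
x = 100.0 + 2.0 * t + 0.5 * t ** 2 + seasonal

# first difference followed by seasonal difference at lag 4
y = difference(difference(x, lag=1), lag=4)
```

For this deterministic series the result is constant: the first difference removes the level, and the lag-four difference removes both the repeating quarterly pattern and the remaining linear trend. With real data the same operation leaves a stationary remainder to which an ARMA model can be fitted.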


Fig. 3 The auto-correlation plot of revenue budget execution. The auto-correlation trails off, but 
the partial auto-correlation truncates after the third order 

Fig. 4 The auto-correlation plot of expenditure budget execution. The auto-correlation trails off, 
but the partial auto-correlation truncates after the fifth order 

After the fourth-order difference that extracts the seasonal effect, the data form a 
stationary non-white-noise series, which meets the prerequisites of modeling. We then 
take the data from 2000 to 2016 as the training set and the data from 2017 and 2018 as 
the test set. 


4.1.2 Unit Root Non-stationary Model 


The seasonal period is set to 4, and an AR(3) specification is chosen based on the 
AIC. In the resulting SARIMA(3, 1, 0)(0, 1, 0)_4 model, all parameters are 
statistically significant and the AIC is 1082.82. For the MA specification, the 
auto-correlation plot indicates a lag order of three, so SARIMA(0, 1, 3)(0, 1, 0)_4 
is established. The model shows that 
only coefficient ma, was not statistically significant and the AIC is 1079.42. Setting 
ma, to 0 and AIC becomes 1080.63, so we choose the first MA model. 

For SARIMA(3, 1, 3)(0, 1, 0)4 the AIC is 1083.23, and coefficients ar1, ar2, ar3,
ma1, and ma2 are not statistically significant. Fixing ar1 to 0 gives an AIC of
1081.28, with ar2, ma1, and ma2 still not statistically significant. Fixing ma1 to 0
as well lowers the AIC to 1079.38, with all remaining coefficients statistically
significant.

Comparing all models by AIC, SARIMA(3, 1, 3)(0, 1, 0)4 with ar1 = 0 and ma1 = 0 is
the optimal unit root non-stationary model for revenue execution.
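
The AIC comparison for the revenue series can be summarized in a short sketch. The AIC values are those reported in the text above; the dictionary labels are descriptive only, and the `aic` helper simply states the standard definition (2k minus twice the log-likelihood) rather than reproducing the authors' software.

```python
# Sketch: choose the revenue model with the smallest AIC.
# AIC values are copied from the text; the labels are descriptive only.

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

revenue_candidates = {
    "SARIMA(3,1,0)(0,1,0)4": 1082.82,
    "SARIMA(0,1,3)(0,1,0)4": 1079.42,
    "SARIMA(0,1,3)(0,1,0)4, one MA term fixed to 0": 1080.63,
    "SARIMA(3,1,3)(0,1,0)4": 1083.23,
    "SARIMA(3,1,3)(0,1,0)4, one AR term fixed to 0": 1081.28,
    "SARIMA(3,1,3)(0,1,0)4, one AR and one MA term fixed to 0": 1079.38,
}

best = min(revenue_candidates, key=revenue_candidates.get)
print(best, revenue_candidates[best])
```

The margin over the unrestricted MA model (1079.42) is only 0.04, so the ranking by AIC alone is close; the chapter's later comparison on test data is the more decisive check.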

For expenditure budget execution the seasonal period is also set to 4. Starting with
an AR model, the order is set to 6 according to the AIC; in
SARIMA(6, 1, 0)(0, 1, 0)4 all parameters are statistically significant and the AIC
is 1132.89. For the MA model, SARIMA(0, 1, 5)(0, 1, 0)4 has an AIC of 1136.75, and
coefficients ma2, ma3, ma4, and ma5 are not statistically significant. After fixing
ma2, ma3, ma4, and ma5 to 0, all remaining parameters are statistically significant
and the AIC becomes 1131.5.

For the mixed model SARIMA(6, 1, 5)(0, 1, 0)4 the AIC is 1133.54, and coefficients
ar1, ar3, ar5, ma2, and one other MA coefficient are not statistically significant.
After fixing ar5, ma1, ma3, and ma4 to zero, the AIC becomes 1129.38 and all
remaining coefficients are statistically significant. Comparing the AIC of all the
models above, SARIMA(6, 1, 5)(0, 1, 0)4 with ar5 = 0, ma1 = 0, ma3 = 0, and ma4 = 0
is the optimal unit root non-stationary model.


4.1.3 Fixed Trend Model 


Since the data have an increasing trend, a fixed trend model is another candidate.
To allow for a nonlinear trend, revenue is regressed on time; both the first-order
and second-order time coefficients are statistically significant, with an R-squared
of 0.95. Overlaying the estimated trend on the original series (Fig. 5) shows that
the fitted values are consistent with the actual ones; the fitting residuals are
shown in Fig. 6.
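
A quadratic trend regression of this kind can be sketched in plain Python. This is only an illustration under stated assumptions: the series is synthetic, time is indexed 0, 1, 2, ..., and the helpers `solve_linear_system` and `fit_quadratic_trend` are hypothetical names; the chapter does not say which software was used.

```python
# Sketch: least-squares quadratic trend y = b0 + b1*t + b2*t^2 and its R-squared.
# Pure Python (normal equations + Gaussian elimination); the data are synthetic.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_quadratic_trend(y):
    """Return (coefficients, fitted values, R-squared) for t = 0, 1, ..., len(y)-1."""
    design = [[1.0, float(t), float(t) ** 2] for t in range(len(y))]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(3)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, fitted, 1.0 - ss_res / ss_tot

# Synthetic series: quadratic trend plus an alternating disturbance.
y = [50.0 + 3.0 * t + 0.5 * t * t + (4.0 if t % 2 == 0 else -4.0) for t in range(20)]
beta, fitted, r_squared = fit_quadratic_trend(y)

# The detrended residuals are what a time series model would then be fitted to.
residuals = [yi - fi for yi, fi in zip(y, fitted)]
print([round(b, 2) for b in beta], round(r_squared, 3))
```

A high R-squared here only says the trend captures most of the variation; any remaining serial correlation sits in `residuals`, which is why the chapter goes on to model the detrended part.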


Fig. 5 Fitting effect of revenue budget execution (billion yuan). The fitted curve is basically
consistent with data


Fig. 6 Fitting residuals of revenue (billion yuan). The residuals are clearly a non-stationary time series


Fig. 7 Fitting effect of expenditure budget execution (billion yuan). The fitted curve is basically
consistent with data


After the fixed trend is removed, the remaining part is clearly a non-stationary
series, and its auto-correlation plot shows that an AR model can be applied. AR(5)
is chosen according to the AIC, with an AIC of 1173.7. Because the data have a
strong seasonal correlation, a fourth-order difference is applied and
SARIMA(1, 0, 0)(0, 1, 0)4 is built according to the auto-correlation plot; its AIC
is 1092.85.

The regression of expenditure on time shows that the first-order and second-order
coefficients are statistically significant, and the R-squared is about 0.9. Figures 7
and 8 show that the fitted values are consistent with the data and that the residual
meets the modeling requirement.

An AR(4) model is then applied, with an AIC of 1223.14. Accounting for the seasonal
pattern with a fourth-order difference gives the optimal model,
SARIMA(0, 0, 0)(0, 1, 0)4, with an AIC of 1142.3.


4.1.4 Model Comparison and Testing 


The optimal unit root non-stationary model and the fixed trend model are each
selected based on AIC, and their performance on the test data is compared to make
the final choice (see Table 2). The mixed SARIMA models outperform the pure AR and
MA models with seasonal parameters for both revenue and expenditure. For revenue
budget execution, the fixed trend model outperforms the unit root non-stationary
model on the test data, although its AIC is slightly higher. For expenditure budget
execution, the unit root non-stationary model outperforms the fixed trend model on
both training and test data. In conclusion, SARIMA(1, 0, 0)(0, 1, 0)4 with a fixed
trend is the optimal model for predicting revenue execution, and
SARIMA(6, 1, 5)(0, 1, 0)4 with ar5 = 0, ma1 = 0, ma3 = 0, ma4 = 0 is the optimal
model for predicting expenditure execution. In validation, the prediction residuals
of both models meet the white noise requirement.


Fig. 8 Fitting residuals of expenditure (billion yuan). The residuals are clearly a non-stationary time series


Table 2 Comparison of revenue and expenditure forecasting models


Series      | Model type                     | Model                                                         | AIC     | RMSE
Revenue     | Unit root non-stationary model | SARIMA(3, 1, 0)(0, 1, 0)4                                     | 1082.82 | /
            |                                | SARIMA(0, 1, 3)(0, 1, 0)4                                     | 1079.42 | /
            |                                | SARIMA(3, 1, 3)(0, 1, 0)4, ar1 = 0, ma1 = 0                   | 1079.38 | 4411.12
            | Fixed trend model              | SARIMA(1, 0, 0)(0, 1, 0)4 with quadratic trend term           | 1092.85 | 4212.779
Expenditure | Unit root non-stationary model | SARIMA(6, 1, 0)(0, 1, 0)4                                     | 1132.89 | /
            |                                | SARIMA(0, 1, 5)(0, 1, 0)4, ma2 = 0, ma3 = 0, ma4 = 0, ma5 = 0 | 1131.5  | /
            |                                | SARIMA(6, 1, 5)(0, 1, 0)4, ar5 = 0, ma1 = 0, ma3 = 0, ma4 = 0 | 1129.38 | 3242.246
            | Fixed trend model              | SARIMA(0, 0, 0)(0, 1, 0)4 with quadratic trend term           | 1142.3  | 4376.246
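
The test-set comparison rests on the root mean squared error (RMSE). The sketch below gives the standard definition and then ranks the two model families using the RMSE values listed in Table 2. The `rmse` helper and the dictionary labels are illustrative; the toy inputs in the first call are invented.

```python
import math

# Sketch: RMSE definition and a comparison using the RMSE values in Table 2.

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Toy check of the definition (invented numbers).
print(round(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), 4))

# Test-set RMSE values reported in Table 2 (labels are descriptive only).
revenue_rmse = {"unit root non-stationary": 4411.12, "fixed trend": 4212.779}
expenditure_rmse = {"unit root non-stationary": 3242.246, "fixed trend": 4376.246}

best_revenue = min(revenue_rmse, key=revenue_rmse.get)
best_expenditure = min(expenditure_rmse, key=expenditure_rmse.get)
print(best_revenue, "|", best_expenditure)  # fixed trend | unit root non-stationary
```

Unlike AIC, which is computed on the training data, RMSE on the held-out years measures forecasting accuracy directly, which is why the two criteria can disagree for the revenue series.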


4.1.5 Forecast and Policy Advice 


The forecast shows that revenue and expenditure execution will continue to grow:
the state general public budget revenue will reach 246,765.0 and 267,782.89 billion
yuan in 2021 and 2022, while expenditure will reach 258,529.00 and 273,031.33
billion yuan (see Table 3). Figure 9 shows that expenditure will still exceed
revenue in the next two years, but the gap between revenue and expenditure will
narrow markedly. According to the model, the difference will be -11763.96 billion
yuan in 2021 and will shrink to -5248.44 billion yuan in 2022.
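
The annual figures quoted above can be cross-checked by summing the quarterly forecasts in Table 3. The sketch below copies the quarterly revenue and expenditure values from that table; the function name `annual_totals` is hypothetical.

```python
# Sketch: aggregate the quarterly forecasts of Table 3 into annual totals.
# Each entry is (revenue, expenditure), copied from Table 3.
forecast = {
    "2021Q1": (59138.40, 56870.24),
    "2021Q2": (68150.72, 75785.07),
    "2021Q3": (58292.60, 62349.92),
    "2021Q4": (61183.32, 63523.78),
    "2022Q1": (64315.78, 60857.28),
    "2022Q2": (73379.50, 79660.89),
    "2022Q3": (63572.76, 65720.32),
    "2022Q4": (66514.86, 66792.84),
}

def annual_totals(year):
    """Return (revenue, expenditure, revenue - expenditure) for one year."""
    rows = [v for k, v in forecast.items() if k.startswith(str(year))]
    revenue = sum(r for r, _ in rows)
    expenditure = sum(e for _, e in rows)
    return revenue, expenditure, revenue - expenditure

for year in (2021, 2022):
    revenue, expenditure, gap = annual_totals(year)
    print(year, round(revenue, 2), round(expenditure, 2), round(gap, 2))
```

The sums agree with the annual totals in the text up to rounding, and the annual gap shrinks from roughly -11,764 in 2021 to roughly -5,248 in 2022.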

The excess of expenditure over revenue is the result of China's proactive fiscal 
policy in recent years. Premier Li Keqiang said in the 2020 government work report 
that “the current international situation is more unstable and uncertain, the world 
economic situation is complex and severe; the domestic economy is not yet a solid 
foundation for recovery, consumer spending is still constrained, investment growth is 
not strong enough, small and medium-sized enterprises and individual entrepreneurs 
have more difficulties, and the pressure on stable employment is greater. In this 
situation, the government still needs to give a hand”, so the policy will continue
to act as a stimulant for market vitality.

On the other hand, the decrease in the fiscal deficit indicates that China’s fiscal
sustainability will improve. Although China’s fiscal deficit ratio has consistently
been low compared with Japan, the U.S., and other developed countries, China still
needs to be wary of a rapid expansion of government debt. Once the government
becomes debt-dependent, debt climbs rapidly and creates a “snowball” effect, i.e.,
“issuing new debt to pay off old debt”. In addition, the Chinese government has a
hidden debt problem, which has not yet been accurately measured but is generally
considered to be large (Liu & Huang, 2008).


Table 3 Forecast of the state general public budget revenue and expenditure


Quarter | Revenue  | Expenditure | Difference
2021Q1  | 59138.40 | 56870.24    | 2268.16
2021Q2  | 68150.72 | 75785.07    | -7634.35
2021Q3  | 58292.60 | 62349.92    | -4057.32
2021Q4  | 61183.32 | 63523.78    | -2340.46
2022Q1  | 64315.78 | 60857.28    | 3458.49
2022Q2  | 73379.50 | 79660.89    | -6281.39
2022Q3  | 63572.76 | 65720.32    | -2147.56
2022Q4  | 66514.86 | 66792.84    | -277.98


Fig. 9 Forecast of revenue and expenditure budget execution from 2021 to 2022 (quarter; billion
yuan). In the next two years, the expenditure execution will still exceed the revenue, while the
difference between revenue and expenditure will continue to narrow

All in all, the future trend of fiscal revenue and expenditure reflects the trade-off 
between controlling fiscal deficit and restoring the vitality of the market economy, 
which is a reflection of the “no sharp turn” fiscal policy.

By 2020, China had already implemented a massive tax and fee reduction policy.
Finance Minister Liu Kun stated at a meeting of the NPC on March 5, 2021: "During
the 13th Five-Year Plan period, tax cut and fee reduction is unprecedented, reaching
a total of 7.6 trillion yuan, thus effectively promoting the development of market
players and the real economy." However, tax cuts and fee reductions also make it
more difficult to balance fiscal revenue and expenditure. It is therefore suggested
that the finance department adjust policy appropriately and optimize the tax
structure, raising some taxes and lowering others rather than applying large,
across-the-board tax cuts. At the same time, the existing tax policy should be
implemented, especially the “precise policy”, which means the cuts must be applied
to the areas and the small and medium-sized enterprises in greatest difficulty. In
terms of expenditure, the strategy of reducing spending should be maintained,
especially spending on overseas travel, vehicle purchase and operation, and official
receptions.


4.2 Prediction of Budget Variance 


Based on Eq. 1, the chapter predicts budget variance using the following formula.
The factors affecting budget variance can be divided into those that affect the
level of budget execution and those that affect the budget variance rate, and
modeling the two separately helps to better capture the serial correlation in the
data. This section completes the modeling of the budget variance rate.


\[
\text{Budget Variance} = \text{Budget Execution} \times \frac{\text{Budget Variance Rate}}{\text{Budget Variance Rate} + 1} \tag{21}
\]
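
Eq. 21 can be applied directly once the execution level and the variance rate are known. The sketch below assumes the rate is expressed as a fraction (5% as 0.05); `budget_variance` is a hypothetical helper name. The first call uses invented numbers, and the second uses the 2021 revenue forecasts reported in Sects. 4.1.5 and 4.2.4.

```python
# Sketch: Eq. 21, budget variance from execution level and variance rate.

def budget_variance(execution, variance_rate):
    """Eq. 21: variance = execution * rate / (rate + 1), with the rate as a fraction."""
    if variance_rate == -1:
        raise ValueError("a variance rate of -100% leaves Eq. 21 undefined")
    return execution * variance_rate / (variance_rate + 1)

# Hypothetical example: a budget of 100 executed at 105 has a variance rate
# of 5% and therefore a variance of 5.
print(round(budget_variance(105.0, 0.05), 6))  # 5.0

# 2021 revenue: forecast execution 246,765.0 and forecast variance rate 1.0377%.
print(round(budget_variance(246765.0, 0.010377), 2))  # 2534.38
```

Dividing by (rate + 1) converts the execution level back to the budgeted amount, so the product is the gap between execution and budget.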


4.2.1 Exploratory Data Analysis 


Over-collection and over-spending were common between 2000 and 2018, and Fig. 10
also shows that the budget variance rates of revenue and expenditure are both
non-stationary time series. After first-order differencing the data pass the
stationarity test, so a unit root non-stationary model can be applied.


4.2.2 Unit Root Non-stationary Model


For the revenue budget variance rate, an MA model is applied first, with the lag
order set to 2 according to the auto-correlation plot. ARIMA(0, 1, 2) has an AIC of
110.63, and parameter ma1 is not statistically significant. Setting it to zero lowers
the AIC to 109.6, which improves on the unrestricted model.

For the AR model, the lag order is also set to 2 according to the auto-correlation
plot. ARIMA(2, 1, 0) has an AIC of 108.19, and parameter ar1 is not statistically
significant. Setting ar1 to zero raises the AIC to 108.63, so the unrestricted model
is kept.

ARIMA(2, 0, 2) has an AIC of 116.64. Setting the insignificant parameters ma1 and
ma2 to zero reduces it to the AR model, so the ARMA model is not included as a
candidate.

For the expenditure budget variance rate, an MA model is applied after first-order
differencing, with the lag order set to 2 according to the auto-correlation plot.
The ARIMA(0, 1, 2) model has an AIC of 90.89, with parameters ma1 and ma2 both
statistically significant. The two parameters are retained because the AIC
increases after removing them.

According to the auto-correlation plot, ARIMA(2, 1, 0) is fitted, with an AIC of
92.92 and ar1 not statistically significant. Setting ar1 to zero lowers the AIC to
92.37.

For the ARMA specification, ARIMA(2, 0, 2) has an AIC of 93.83. Setting ar1 and ma2
to zero lowers the AIC to 90.41.


Fig. 10 The state public budget data variance rate (%)


Table 4 Comparison of the model on budget variance rate


Series      | Model                            | AIC
Revenue     | ARIMA(0, 1, 2), ma1 = 0          | 109.6
            | ARIMA(2, 1, 0)                   | 108.19
Expenditure | ARIMA(0, 1, 2)                   | 90.89
            | ARIMA(2, 1, 0), ar1 = 0          | 92.37
            | ARIMA(2, 0, 2), ar1 = 0, ma2 = 0 | 90.41


4.2.3 Model Comparison and Testing


The optimal model is selected from those mentioned above (see Table 4):
ARIMA(2, 1, 0) is the optimal model for the revenue budget variance rate, and
ARIMA(2, 0, 2) with ar1 = 0 and ma2 = 0 is the optimal model for the expenditure
budget variance rate. The residuals of both models pass the white noise test.
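
The chapter does not name the white noise test it uses. As one common choice, the sketch below computes a Ljung-Box Q statistic in plain Python and compares it with a chi-square critical value. The residual series is invented, the helper names are hypothetical, and the sketch ignores the degrees-of-freedom adjustment that is usual when the residuals come from a fitted ARMA model.

```python
# Sketch: Ljung-Box Q statistic as one possible white noise check.
# Assumption: the source does not specify its test; this is an illustration only.

def autocorrelation(series, k):
    """Sample autocorrelation at lag k."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return num / denom

def ljung_box_q(residuals, max_lag):
    """Q = n (n + 2) * sum_{k=1..h} r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

CHI2_95_DF4 = 9.488  # 95% critical value of the chi-square distribution, 4 d.f.

# A strongly autocorrelated (alternating) series is clearly not white noise.
alternating = [1.0, -1.0] * 10
q = ljung_box_q(alternating, max_lag=4)
print(round(q, 2), q > CHI2_95_DF4)  # 77.0 True
```

Residuals that behave like white noise would give a Q statistic below the critical value, so the hypothesis of no remaining serial correlation would not be rejected.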


4.2.4 Forecast and Policy Advice 


The revenue budget variance rate is forecast to be 1.0377% and 0.4684% in 2021 and
2022, and the expenditure budget variance rate 3.685% and 3.8845%. Figure 11 shows
that the budget variance rate in the next two years will be further controlled
compared with previous years, while over-collection, over-spending, and the general
pattern of the expenditure variance rate exceeding the revenue variance rate will
persist. Substituting the forecasts into Eq. 21, the revenue variance in 2021 and
2022 will be 2534.38 and 1248.44 billion yuan, while the expenditure variance will
be 9188.20 and 10209.32 billion yuan. Compared with 2017 and 2018, the absolute
value of budget variance remains unchanged, which, given the continuous growth of
the national economy, indicates that it is under control. The forecast results can
be explained from four perspectives:

The first factor is the economy. Economic factors affect both the budget variance
rate and the level of budget execution: the faster the economy develops, the faster
budget execution grows, while the uncertainty of the budgeting process also
increases, leading to greater variance. In the 2010s, China shifted its development
priorities from "speed" to "quality" and started to consciously control its economic
growth rate, while at the same time launching supply-side reform. These macro
factors have made it easier to estimate China's growth prospects and thus make it
much less difficult to control budget deviations in the future. From a revenue
perspective, it is common practice for the Chinese government to set revenue budgets
by adding a few percentage points to the current year's GDP growth rate (Sun & Wu,
2012). Considering that governments at all levels have always been inclined to leave
room while budgeting (Wang, 2009), the over-collection phenomenon will persist in
the long run. From the expenditure perspective, more fiscal spending items arise
under proactive fiscal policy, making variance harder to control, which partly
explains why expenditure variance will exceed revenue variance in the future.


Fig. 11 The forecast budget variance rate (year; %). The revenue and expenditure variance rates
will be further controlled in 2021 and 2022, and the rate of expenditure will be higher than that of
revenue


The second factor is the soft constraints of budget execution (Chen & Lv, 2019).
This factor mainly affects the level of the budget variance rate: the stricter the
constraints, the lower the rate. Before 2007, over-collected funds were neither
included in the supervision of the NPC nor excluded from the next year's budget.
This inadequate regulation became an incentive for over-collection (Ma, 2009).
Similarly, the lack of supervision of under-spent funds encourages expenditure
variance to increase. To solve the soft constraint problem, the GOC has made
unremitting efforts since 2007. In 2007, the central government established the
Central Budget Stabilization and Adjustment Fund (CBSAF) to save over-collected
funds and use them to subsidize years of under-collection. The Budget Law of the
People's Republic of China (2014 Revision) (hereinafter referred to as the new
budget law), implemented in 2014, increased control over budgeting processes by
increasing the transparency of the budget and establishing a system to control and
balance the budget across years. The second revision of the Regulations on the
Implementation of the Budget Law in 2018 proposed the reform guideline of improving
the budget performance management system and constructing a new pattern of all-round
budget performance management. These moves aim to harden the soft constraints, which
will help restrain the budget variance in the long run.

The third factor is the fiscal management system, which mainly affects the level of
the budget variance rate. As local governments rely on the central government's
transfer payments, they are motivated to compete for more financial aid than they
need, which encourages the phenomenon of "fighting more than spending" and
opportunistic behavior (Chen & Lv, 2019). Furthermore, local governments tend to
raise their spending, especially in the last quarter, to fulfill budget tasks, which
leads to irregular use of funds and other risks. These two factors contributed to
the variance in fiscal expenditure in the past. However, the Regulations on the
Implementation of the Budget Law, revised for the second time in 2018, aim to
improve the transfer payment process and clarify the fiscal relationship between
governments by clarifying the types and scope of transfer payments, improving the
evaluation of special transfer payments, and regulating the process of fund transfer
more strictly. These actions have significantly controlled the fiscal expenditure
variance, which is reflected in the forecast.

The fourth factor is external supervision, which mainly affects the level of the
budget variance rate. In China, the NPC is charged with overseeing the budget
(Chen & Lv, 2019), while the private sector also plays a supervisory role. Before
the revision of the budget law, the NPC's supervision was not sound enough and
lacked professionalism, and it was also difficult for the private sector to exercise
strong supervision of the budget because the government disclosed insufficient
information. In 2014, the new budget law strengthened the NPC's supervision and
audit power over the use of budget funds and required the budget report to be more
detailed. In 2018, the Regulations on the Implementation of the Budget Law further
required more transparency, disclosing more data about government debt, agency
operating expenses, government procurement, and earmarked funds. Furthermore, they
stipulated explicitly that special transfer payments should be disclosed by region
and project, and expenditures by item. These measures strengthen the supervisory
function of the NPC and society, which reflects the people's ownership and will
continuously curb the budget variance rate.

All in all, the decrease in the budget variance rate in the next two years is the
result of the continuous reform of China's fiscal budget system over the years. It
is one of the most important achievements of the GOC's modernization of national
governance capacity.

Based on the results of modeling and analysis, the budget variance can be further
controlled by focusing on the following aspects. The first is to develop more
scientific budgeting methods, such as adopting more mathematical models (e.g., time
series techniques and uncertainty theory) and big data techniques (e.g., deep
learning), to establish a more predictive budget planning system. The second is to
strengthen the management of budget performance evaluation, for example by
implementing medium-term financial budgeting management and gradually integrating it
into the existing budget performance evaluation. The third is to harden the soft
constraints on budget planning and execution, for example by eliminating the misuse
of over-collected and under-spent funds through legislation. The fourth is to
further improve the budget law, clarify the fiscal relationship between the central
and local governments, focus on controlling the scale of transfer payments,
especially special transfer payments, and enhance the efficiency of capital flow.
Finally, the GOC should further strengthen supervision of budget planning and
execution to create a more transparent budget system. For this purpose, the NPC's
power must be reinforced and more detailed fiscal data need to be published.


5 Conclusion


This chapter completes a descriptive analysis of China's budget data from 2000 to
2018, pointing out that the historical level of budget variance is affected by the
global economy and China's macroeconomic policies. A series of budgetary management
actions and fiscal reforms implemented by the Chinese government since 2014 have
been effective, as reflected in the fact that the budget variance has been well
controlled in recent years. The chapter uses unit root non-stationary models and
fixed trend models to model budget execution and budget variance rate data. The
budget variance in the next two years is calculated from the forecasts of the two
sets of models. According to the prediction, the budget variance in 2021 and 2022
will be further controlled, and this positive trend results from the combination of
economic, soft-constraint, institutional, and regulatory factors.

There is still much room for improvement. The forecasts for 2021 and 2022 may
contain errors (Zhao & Wu, 2013), mainly because the impact of the epidemic on
China's economy is not taken into account. The combination of fiscal theory and
modeling results can also be improved. Other researchers can extend the analysis,
for example by forecasting the level of budget variance of a province or
municipality and thus drawing conclusions with local characteristics (Lin & Ma,
2013). China's fiscal data are increasingly abundant, so the research difficulty of
this topic will decrease over time. Finally, other time series methods can be used
for the tasks accomplished in this chapter, such as time series multiplicative
models (Jiang & Cheng, 2018), and more advanced mathematical tools such as
uncertainty theory (Liu & Peng, 2005), data mining techniques (Qin, 2018), and
Markov chains (Hou et al., 2010; Li & Yi, 1997) can be used during modeling, which
may lead to more accurate conclusions.